Lloyd Brown

Research Blog

I am currently focusing on the storage fabric design of Resource Disaggregation. I am delighted to be a part of the CREU program for the 2017-2018 academic year.

Week 35: April 30 - May 6

Work Accomplished: This week I worked on the final report for the CREU project. I thought about where I started and where I ended and all of the work I put in over the course of the program. I thought about how this experience affected me as a researcher and how I would carry the skills I gained throughout the rest of my academic career.

Outcomes: After reflecting on this past year's work I am happy with the result of the work I put in. This program was a great introduction to the field of research and taught me a lot of lessons. This year made me more confident in my ability to lead a research project from the front and I think this is invaluable for an undergraduate who intends to pursue academic research as a career. This project also taught me that you have to be willing to persevere when the results of your work are not exactly as expected. Research does not always go the way you plan so you have to be flexible and consistent. I believe that my participation in this project will have a lasting effect on me academically.

Week 34: April 23 - April 29

Work Accomplished: This week I started to merge together the different implementations of the data structures I decided would be useful for databases. I thought about the testing suites of the implementations and decided to conform them to the one that tested in an environment most similar to real databases. I also looked at the testing results and tried to begin implementation with the knowledge of the implementations that provide the best results.

Outcomes: By leveraging the positives of these different implementations I will be able to create a testing environment that is indicative of the real-world environment. Testing results are meaningless if they bear little relation to the actual environment. Once this work is complete I will be able to evaluate machine learning models that provide the same properties as these data structures.

Goals: Next week I plan to work on the final report for the CREU program.

Week 33: April 16 - April 22

Work Accomplished: This week I finalized my decision on what I think are the data structures that provide the basic operations needed in databases. These data structures are space-efficient and have reasonable performance. Given these characteristics I believe that these structures will be competitive with any machine learning models that serve to provide the same functionality. I also took a look at a couple of implementations of these data structures.

Outcomes: The work these past two weeks has given me tools whose performance I believe in. In the future when I need to compare potential machine learning models with their peers that perform the same task I can be confident in the contrasts the performances give me. This will allow me to generalize with confidence the usefulness of any given model. Looking at the different implementations of these data structures will allow me to eventually combine them and have a central place where I can evaluate performance.

Goals: Next week I plan to get started on combining these various implementations together.
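
As a rough illustration of what a central place for evaluating performance could look like, here is a minimal Python sketch. It assumes the operation being compared is a membership lookup, and the names `benchmark`, `build_hash`, and `build_sorted` are hypothetical; the blog does not say which structures or operations were actually chosen. Each structure is built from the same keys, checked against the others for agreement, and then timed on the same queries.

```python
import bisect
import time


def benchmark(structures, keys, queries, repeats=3):
    """Time each structure's lookups on the same keys and queries.

    structures maps a name to a build function; build(keys) must return
    a callable lookup(query). Returns name -> best wall-clock seconds.
    """
    results = {}
    expected = None
    for name, build in structures.items():
        lookup = build(keys)
        # All structures must agree before their timings are comparable.
        answers = [lookup(q) for q in queries]
        if expected is None:
            expected = answers
        elif answers != expected:
            raise AssertionError(f"{name} disagrees with the first structure")
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            for q in queries:
                lookup(q)
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results


def build_hash(keys):
    # Hash-based membership test (illustrative stand-in).
    table = set(keys)
    return lambda q: q in table


def build_sorted(keys):
    # Sorted array with binary search (illustrative stand-in).
    arr = sorted(keys)

    def lookup(q):
        i = bisect.bisect_left(arr, q)
        return i < len(arr) and arr[i] == q

    return lookup
```

A call such as `benchmark({"hash": build_hash, "sorted": build_sorted}, range(0, 1000, 2), [1, 2, 3, 998, 1000])` returns one timing per structure. A machine learning model could be added later as one more build function, provided it exposes the same lookup interface.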

Week 32: April 9 - April 15

Work Accomplished: This week I did an extensive survey of the different data structures that are used to perform the basic tasks needed within a database. I performed this work in a similar manner to the exploration I did recently. I thought about the basic guarantees that a database needs to provide and thought about the data structures that were paramount to those tasks. I also made sure to look at different variants that perform the same tasks so as to choose the most effective structure.

Outcomes: The ultimate goal of this work is to provide an understanding of the performance of different data structures and relate it to machine learning models that provide the same functionality. By choosing the best data structures I can be sure that my implementation is competitive. This week gave me a wealth of understanding about the preliminary mechanisms considered as well as variants that may be more effective.

Goals: Next week my goal is to finalize my decision on my choices of structures and then begin an implementation that allows me to evaluate performance.

Week 32: April 2 - April 8

Goals: This week is spring break so I will be away from Cornell for the time being.

Week 31: March 26 - April 1

Work Accomplished: This week I went through the potential research directions and made sense of the ideas I have in mind. After looking at the pairings I came up with what I think were the best possibilities. I made sure to think about the unique usefulness that the data structure provides and also how the machine learning model I am thinking about can make it even better. I also considered how this connection would perform on various types of datasets with various types of access patterns.

Outcomes: Now that I have settled on a more concrete problem I have a good understanding of the work that needs to be done in the near future. With this problem in mind I will start out trying to recreate results for similar papers and leverage the understanding I get from this work to improve my research in this area.

Goals: Next week is spring break so I will be away from Cornell for the time being.

Week 30: March 19 - March 25

Work Accomplished: This week I looked deeper into the replacement of traditional data structures using machine learning models and came up with some preliminary ideas for a research direction. I thought about the data structures that are known to be lacking in some areas and thought about the reason behind their poor performance in those situations. I then thought about the strengths and weaknesses of other machine learning models and tried to pair these data structures and machine learning models in order to mitigate any weaknesses.

Outcomes: This has given me a list of potential pairings to choose from which will allow me to choose a more fruitful research direction. It has also given me a view of where data structures and machine learning models thrive holistically.

Goals: Next week I will think more about these pairings and try to pick a couple to focus on for the near future.

Week 29: March 12 - March 18

Work Accomplished: This week I read more literature in the area of machine learning. I took a look at an interesting way to use machine learning in order to improve the functionality that hardware provides. I also finished looking at the data structures I started to study recently. I made sure to understand their basic characteristics and think about the guarantees we would like to provide if we were to replace them with a machine learning model.

Outcomes: This week's work has given me a great understanding of the possible data structures I would like to replace. I feel confident in deriving some constraints for a possible machine learning algorithm. The innovative machine learning work I studied has also inspired me to pursue this direction. I am excited to explore the possibilities.

Goals: Next week I will focus on the data structure I think would best benefit from a machine learning algorithm and will use recent work to inspire a research direction.

Week 28: March 5 - March 11

Work Accomplished: This week I did a survey of a couple of different data structures relating to indexing. I first took a look at their base implementations and then I compared them to their more optimized versions. While examining these implementations I thought about the performance guarantees each of these data structures provides. I also made sure to think about how a machine learning model might improve upon or worsen the performance. The last thing I did was to determine the characteristics that would need to be kept in a switch to a machine learning model.

Outcomes: As a result of my work this week I have a much better understanding of these various data structures. I took the time to understand the intuition behind them as well as when they perform well. I also have a much better understanding of the work that has been done in the field to optimize them. Now that I have a sense of how people are using them today I am better equipped to determine how a machine learning model could replace or enhance them.

Goals: Next week I will finish up my analysis of these data structures and look at more literature on related machine learning models.

Week 27: February 26 - March 4

Work Accomplished: This week I started to pursue a different perspective; new directions have emerged on using machine learning models to more efficiently use memory. I read a paper discussing this idea as an alternative to traditional data structures such as hash maps. This leads to interesting questions about the true strength of machine learning models and how to leverage their unique predictive abilities. I also thought about the weaknesses of traditional data structures and where machine learning models could improve upon them.

Outcomes: Given my survey of this topic I found I was interested in this research area. I plan to delve deeper into the questions I discussed and to get a more holistic view of machine learning models as a whole.

Goals: Next week I will continue to learn more about traditional data structures and the work that has been done to optimize them. With these optimizations in mind I will then consider the pros and cons of their partial or full replacement by a machine learning model.
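
To make the idea of replacing a traditional lookup structure with a model more concrete, here is a minimal Python sketch of a learned index. It is an illustration under stated assumptions, not the method of the paper I read: the class name `LinearLearnedIndex` is hypothetical, the keys are assumed to be numeric and sorted, and the "model" is a single least-squares line. The model predicts where a key sits in the sorted array, and a bounded binary search around that prediction corrects the model's error.

```python
import bisect


class LinearLearnedIndex:
    """A one-line 'learned index' over sorted numeric keys (illustrative)."""

    def __init__(self, sorted_keys):
        self.keys = list(sorted_keys)
        n = len(self.keys)
        if n == 0:
            raise ValueError("need at least one key")
        # Least-squares fit of position ~ slope * key + intercept.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst prediction error over the stored keys bounds the search window.
        self.max_err = max(
            abs(i - self._predict(k)) for i, k in enumerate(self.keys)
        )

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the position of key in the sorted array, or None."""
        n = len(self.keys)
        guess = self._predict(key)
        # Clamp the window to valid array bounds.
        lo = min(max(0, guess - self.max_err), n)
        hi = min(max(lo, guess + self.max_err + 1), n)
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < n and self.keys[i] == key:
            return i
        return None
```

The trade-off this exposes is the one discussed above: the model stores only two numbers and an error bound, but the lookup cost depends on how well the line fits the key distribution. Evenly spaced keys give a tiny search window, while skewed keys widen it toward an ordinary binary search.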

Week 26: February 19 - February 25

Work Accomplished: This week I began to work on the design for the scheme I discussed last week. I looked at the Redis code and thought about where it would be efficient to place the per-user information I need to store. I also thought about how I would integrate this information with decisions the server will need to make, in order to decide whether or not to service particular commands. I also thought about the implications this would have on the performance of individual commands and how best to keep performance similar to the original case.

Outcomes: Given this work it is likely I will have to keep this information global and edit it upon each new connection. As a user accesses data I will need to update its respective pair. I also need to decide what to do with cache used up by users no longer in the system. If I choose to relinquish this space then I will need to keep track of what data each user owns, which could negatively affect performance.

Goals: Next week I will work to outline the details of this design, and think about its performance implications.
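
The decision logic sketched in this entry can be modeled outside of Redis in a few lines of Python. This is only a sketch under assumptions: it does not touch the Redis code, the class name `QuotaGate` and its methods are hypothetical, and the fixed per-user byte quota is an assumed policy, since the entry does not say what rule decides whether a command is serviced. It does show the extra bookkeeping mentioned above: to relinquish a departed user's space, the gate has to remember which keys each user owns.

```python
from collections import defaultdict


class QuotaGate:
    """Global per-user state consulted before servicing a store command."""

    def __init__(self, quota_bytes):
        self.quota_bytes = quota_bytes      # assumed fixed per-user limit
        self.used = defaultdict(int)        # user_id -> bytes owned
        self.keys = defaultdict(dict)       # user_id -> {key: size}

    def allow_set(self, user_id, key, size):
        """Return True and record the write if it fits the user's quota."""
        old = self.keys[user_id].get(key, 0)
        # Overwriting a key replaces its old size rather than adding to it.
        new_total = self.used[user_id] - old + size
        if new_total > self.quota_bytes:
            return False
        self.keys[user_id][key] = size
        self.used[user_id] = new_total
        return True

    def release_user(self, user_id):
        """Forget a departed user and return the keys that can be evicted."""
        freed = self.keys.pop(user_id, {})
        self.used.pop(user_id, None)
        return list(freed)
```

The per-key dictionary is what makes reclaiming space possible, and it is also the cost I was worried about: every accepted write now updates two structures instead of one.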

Week 25: February 12 - February 18

Work Accomplished: This week I decided that the only way to sensibly store information about cache usage, by user, is to keep pairs of user id and cache owned. After settling on this scheme I thought about how it might be implemented and the possible consequences of certain implementations. I also thought about how this scheme would perform in the common case of many users.

Outcomes: Having settled on a scheme it became easier to think about the effects of my design choices. Since I would be storing data proportional to the number of users I had to make sure that these pairings would not take up too much space. Also since the number of users who have touched the system at any point before a particular time could be indefinitely large, it made sense to throw out pairings of users no longer in the system.

Goals: Next week I will put more work towards this implementation and make sure to implement it with scalability in mind.
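
The pairing scheme from this entry is small enough to sketch in Python. The names below (`CacheUsageTable`, `record_store`, and so on) are hypothetical and only illustrate the idea: one (user id, cache owned) pair per user currently in the system, with the pair thrown out when the user leaves so that the table never grows with the total number of users ever seen.

```python
class CacheUsageTable:
    """Pairs of user id and bytes of cache owned, for live users only."""

    def __init__(self):
        self._owned = {}  # user_id -> bytes currently owned

    def connect(self, user_id):
        # A new user starts out owning nothing.
        self._owned.setdefault(user_id, 0)

    def record_store(self, user_id, nbytes):
        self._owned[user_id] = self._owned.get(user_id, 0) + nbytes

    def owned(self, user_id):
        return self._owned.get(user_id, 0)

    def disconnect(self, user_id):
        # Drop the pair; return what the user owned so it can be reclaimed.
        return self._owned.pop(user_id, 0)

    def __len__(self):
        return len(self._owned)
```

With this layout the space used is one small entry per connected user, and each update is a single dictionary operation, which matches the two requirements of staying compact and keeping updates cheap.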

Week 24: February 5 - February 11

Work Accomplished: This week I began to think about possible ways to store information relating to users' accesses to the database. I assessed the pros and cons of a couple of potential schemes. I made sure to think about the scalability of each solution given that we can expect to see a large number of users at any given time. I also thought about the potential effects on the responsiveness of the server given the different potential update times.

Outcomes: This work helped me understand the different requirements for an effective means of keeping track of transactions with users. First, the data stored should not be unbearably large given a large number of users. Second, updating this data should not have a significant effect on the speed with which the server interacts with its clients.

Goals: Next week I plan to settle on a scheme and begin its implementation. After the implementation is finished I plan on using the load tester previously developed to ensure the model does not harm performance.

Week 23: January 29 - February 4

Work Accomplished: This week I studied the characteristics of the server we are using to cache data in order to benchmark performance of each user. I started by looking at how the server views each connection to the different clients. I then took a look at the mechanism used to allow clients to store data.

Outcomes: Given this work I was able to develop a better understanding of the information that the server maintains on a per client basis. I also learned how the server deals with each request and was able to think about how we might determine whether or not we want to fulfill each request.

Goals: The ultimate goal behind this work is to recreate results of papers that discuss efficient caching by using information such as the user wanting to store data, or how much space that user already has allotted to them.