Friday, December 9, 2011

Improving Food Lookup

I keep running into a roadblock getting my database to insert new meal entries. Up until now, I have mostly been able to simply copy preexisting database entries over to the local db, but for meals I need to create a brand new entry. I have no idea why this is not working, though I have at least figured out that I have no problem creating new entries for the other tables, so it must have something to do with the specific table definition. Once again I pushed this task back, though now that I have a clue I might be able to crack the issue next week.

Instead, I jumped right into improving search functionality. I skimmed the papers quickly, and it seems that I might need more time than I have to implement them, so instead I attempted to use fairly basic heuristics for sorting results. It was actually quite successful.

I initially gathered results simply based on whether or not they contained any of the words in the search query. Obviously this wasn't the best way to do it, but it was a good functionality test. As you can see below, the results aren't very good.

I next tried ranking results based on how much they matched the query. I simply counted the letters in all the query words that existed in each food name. I then normalized the ranks by name length. The results were quite a bit better, but they were still not quite what I wanted. I was getting results that matched both query words at the top of the list, but I was missing some results that started with the first query word. For instance, below you will notice that there are no top results beginning with butter (butter is actually at the top of the unsorted list too, so it was definitely in the db).

My last and current method employed some weighting based on the order of the words in the query and in the word. I calculate three weights:
  • the index in the word at which the query word is found as a substring
    • The index is normalized by the length of the word, and inverted to rank earlier indices higher.
    • E.g. the query "butter" for the word "butter" gets an index rating of 1.0, while "utter" gets an index rating of 0.8333 (1 - 1/6)
  • the matched word's position in the food's name (the first word in the food name is rated higher than the second).
    • The word position ranking is then normalized by the total number of words in the food name.
  • the percentage of the word that the query string matches
I then multiply the three weights together to form the rank, and sort to get the highest ranked foods. The results were excellent. I was testing briefly with Elliot and was able to find whatever food we wanted after entering just a few letters.
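For reference, the three weights can be sketched roughly as below. This is a minimal Python sketch and not the app's actual code: the function names (word_score, rank_food, search) are made up for illustration, the position weight is assumed to be 1 - position / word count, and summing the best score per query word is my assumption for handling multi-word queries, since that part is not spelled out above.

```python
def word_score(query_word, food_word, position, word_count):
    """Product of the three weights for one query word against one food word."""
    index = food_word.find(query_word)
    if index < 0:
        return 0.0  # the query word is not a substring of this food word
    # Weight 1: earlier match index is better, normalized by word length.
    index_weight = 1.0 - index / len(food_word)
    # Weight 2 (assumed form): earlier words in the food name rate higher.
    position_weight = 1.0 - position / word_count
    # Weight 3: fraction of the food word covered by the query word.
    coverage_weight = len(query_word) / len(food_word)
    return index_weight * position_weight * coverage_weight


def rank_food(query, food_name):
    """Assumption: sum, over query words, the best score among the food's words."""
    food_words = food_name.lower().split()
    total = 0.0
    for query_word in query.lower().split():
        total += max(
            (word_score(query_word, word, i, len(food_words))
             for i, word in enumerate(food_words)),
            default=0.0,
        )
    return total


def search(query, food_names, limit=10):
    """Return up to `limit` matching food names, highest rank first."""
    scored = [(rank_food(query, name), name) for name in food_names]
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in matches[:limit]]
```

With this sketch, searching "butter" over "peanut butter", "butter", and "bread" puts "butter" first, because the match is in the first word and covers the whole word, while "peanut butter" is penalized for matching in the second word.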


The search works great, but there still remains the problem that it is fairly slow. Right now it searches every time a letter of the query string is modified, so the ranking is run very often. I will look into how easily I can launch a timer, and only search every ~0.5 seconds or so if the query string has been updated.
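The timer idea might look something like the following. This is only a sketch in Python under my own assumptions: PolledSearch, set_query, and tick are hypothetical names, and tick is meant to be called by whatever repeating timer the platform provides (roughly every 0.5 seconds), which is not shown here.

```python
class PolledSearch:
    """Runs the search at most once per timer tick, and only if the query changed."""

    def __init__(self, search_fn):
        self._search_fn = search_fn  # any function mapping a query string to results
        self._query = ""
        self._dirty = False          # True when the query changed since the last search
        self.results = []

    def set_query(self, text):
        """Called on every keystroke; this is cheap and never searches."""
        if text != self._query:
            self._query = text
            self._dirty = True

    def tick(self):
        """Called by a repeating timer; returns True if a search was run."""
        if not self._dirty:
            return False
        self._dirty = False
        self.results = self._search_fn(self._query)
        return True
```

Typing "b", "bu", "but" between two ticks would then trigger a single search for "but" rather than three separate ones.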


Next week will probably be my last week of real development before the end of the semester. I forgot to factor in time for making my video and write-up when I was setting goals last week, so I will need to reserve the last few days for that. On the plus side though, next week is reading days, so I will have more time to code (in theory at least). I will aim to clean up the nutrition history visualization page. Hopefully I can get meals to work, but I need to vow not to waste too much time on it or nothing else will get done. The next thing after that will probably be hard-coding in a recommended daily values table to get the visualization information normalized properly. After that I will clean up the UI and clean up the code a little bit so it isn't too bad for Norm and Joe to review.

Sunday, December 4, 2011

Beta Review and The Last Three Weeks

I met with Norm and Joe for my Beta Review on Monday. I think that things went much better than the Alpha Review, and I am much happier with my progress. One thing that Norm and Joe said they would like to see was better ease of use. Norm thought it would be good to improve the food search functionality and its results, possibly using something similar to PageRank and maybe caching results. It would also be good to improve the ordering of the nutrient listings to better reflect what users want.

My goals for the remainder of the semester are as follows:

Next Week (12/5 - 12/11)

Reading Week (12/12 - 12/18)
  • Speed up food search using some kind of ranking algorithm
    • If the techniques described in the above two articles are impractical given time constraints, I will fall back to just sorting by the percentage of the word the search string matches.
  • Sort nutrient information based on what is most important
Finals Week (12/19 - 12/21)
  • If search needs refinement, I will spend the rest of my time refining the algorithm
  • If search is well done, I will instead work on adding another feature, perhaps specifying a personal diet goal, or adding in a preview view before permanently adding in a food item.

Friday, December 2, 2011

My Current Status Report and Self Evaluation

This is basically a post analyzing my current status on my project and evaluating how successfully I accomplished my goals. I think it is best to start out by listing my original goals and highlighting the status of each feature. Hopefully, by also categorizing the goals, I can get a better idea of what I did well and what is still lacking.

X = Completed (No major additions needed)
x = Mostly Complete (Functional, but may need small tweaks/additions)
/ = In Progress (Started, but not really functional for practical use)
  = Not Started
- = Originally Planned but Postponed
* = Future feature never planned in timeline


User Interface
[x] Add Food Page (still needs option for number of servings)
[X] Page Displaying Day's Intake of Food
[X] Page Displaying Total Nutrition For Day
[*] Nutrition Facts Page
[*] Preview Page (preview effect of food on day's nutrition)
[-] Barcode Scanning
[*] Voice Input


Database Management
[X] Copy Over Consumed Information to Local DB
[x] Add meals to local DB (need to set dates and servings)
[-] Scrape in Food Product Information
[*] Setup on Server for Main DB


Usability Features
[/] Food Lookup
[ ] Food Search Personalization (Give your common meals higher priority in search)
[ ] Filter preferred nutrients (Only visualize specified nutrients)
[-] UPC Lookup


Analysis Features
[/] Nutrition Visualization (have the nutrition calculated, but still need improvements in presentation order)
[ ] Personalized Goals


In retrospect, I think that I am in decent shape. I did underestimate how much time it would take to get the database and the basic framework up and running. This was definitely the biggest bottleneck during the early stages of development. In order to get caught up, I had to cut out UPC Lookup and Database scraping. My main reason for cutting them out was that they would have required even more work hacking two different databases together. I did not budget enough time for either of these, and so I think that it was the right decision to postpone them. Otherwise, I may have had a database with a wealth of data, but no functionality to show for it.


After my Alpha Review, the general consensus was that I didn't have enough to show for my work. I then decided that I needed to refocus on some immediate usable features. I refocused my efforts on hammering out the UI, and getting actual data interaction. I think that this was a very beneficial decision. I now have something I can actually interact with and test that shows concrete results. Even though some of the implementation is not finished, I have a much better idea of what works, and what parts of my initial assumptions were correct and which were wrong.


There were times during development that I think I lost focus on what I wanted to do. I set out with the mindset that this would be a project that I would continue even after the semester ends (which I still do). However, I think that clouded my thoughts when it came to making decisions on what needed to be done, because I got caught up in my long term goals, and didn't spend enough time implementing the short term goals. In terms of my project's full lifetime, the distribution of my efforts was probably more reasonable. However, my early approach was not the best for getting a working prototype up in just one semester.


I am pretty confident with my current direction now, and I think I have a good focus for the remaining three weeks. I will go into more detail about the specific features that I have chosen in my next post.