Friday, December 9, 2011

Improving Food Lookup

I keep running into a roadblock getting my database to insert new meal entries. Up until now, I have mostly been able to simply copy preexisting database entries over to the local db, but for meals I need to create a brand new entry. I have no idea why this is not working, though I have at least figured out that I have no problem creating new entries in the other tables, so it must have something to do with the specific table definition. Once again I pushed this task back, though now that I have a clue I might be able to crack the issue next week.

Instead, I jumped right into improving the search functionality. I skimmed the papers quickly, and it seems that I would need more time than I have to implement them, so instead I tried fairly basic heuristics for sorting results. It was actually quite successful.

Initially I gathered results simply based on whether or not they contained any of the words in the search query. Obviously this wasn't the best way to do it, but it was a good functionality test. As you can see below, the results aren't very good.
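For reference, that first pass amounts to roughly the following. This is only a minimal sketch in Python rather than the app's actual code: the function names are hypothetical, and treating "contains a word" as a case-insensitive substring test is an assumption.

```python
def matches_any_word(query, food_name):
    """True if any query word occurs as a substring of the food name.

    Assumption: matching is case-insensitive and substring-based.
    """
    name = food_name.lower()
    return any(word in name for word in query.lower().split())


def naive_search(query, food_names):
    """Filter only: matches come back in database order, with no ranking."""
    return [name for name in food_names if matches_any_word(query, name)]
```

Because nothing is sorted, a food that matches one query word weakly is listed on equal footing with an exact match, which fits the poor results described above.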

I next tried ranking results based on how much they matched the query. I simply counted the letters in all the query words that existed in each food name, then normalized the ranks by name length. The results were quite a bit better, but still not quite what I wanted. Results that matched both query words were at the top, but I was missing some results that started with the first query word. For instance, below you will notice that there are no top results beginning with butter (butter is actually at the top of the unsorted list too, so it was definitely in the db).
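A rough sketch of that letter-count ranking, again in Python and under assumptions: the function names are made up, a query word "exists" in a name when it is a case-insensitive substring, and the name length counts every character including spaces and punctuation.

```python
def letter_match_score(query, food_name):
    """Letters of the matching query words, divided by the name length."""
    name = food_name.lower()
    if not name:
        return 0.0
    matched_letters = sum(len(word) for word in query.lower().split() if word in name)
    return matched_letters / len(name)


def rank_by_letter_match(query, food_names):
    """Return matching names, highest score first (ties keep input order)."""
    scored = [(letter_match_score(query, name), name) for name in food_names]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]
```

The score only knows how much of the name is covered by query letters, not where the match occurs, so a name that merely starts with the first query word gets no credit for that.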

My last and current method employed some weighting based on the order of the words in the query and in the word. I calculate three weights:
  • the index in the word at which the query word is found as a substring
    • The index is normalized by the length of the word, and inverted to rank earlier indices higher.
    • E.g. for the word "butter", the query "butter" gets an index rating of 1.0, while the query "utter" gets an index rating of 0.8333 (1 - 1/6)
  • the matched word's position in the food's name (the first word in the food name is rated higher than the second)
    • The word position ranking is then normalized by the total number of words in the food name.
  • the percentage of the word that the query string matches
I then multiply the three weights together to form the rank, and sort to get the highest ranked foods. The results were excellent. I was testing briefly with Elliot and was able to find any food I wanted after entering just a few letters.
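The three weights can be sketched as follows. This is an illustrative Python version, not the app's code, and it fills in details the description leaves open: the position weight is assumed to be 1 - position / word_count, names are split into words on non-alphanumeric characters, and for multi-word queries each query word takes its best-scoring name word and the per-word scores are summed. All function names are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase words, split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_weight(query_word, name_word):
    """1.0 when the match starts the word, lower the later it starts."""
    idx = name_word.find(query_word)
    return 0.0 if idx < 0 else 1.0 - idx / len(name_word)


def position_weight(position, word_count):
    """Earlier words in the food name rate higher (assumed formula)."""
    return 1.0 - position / word_count


def coverage_weight(query_word, name_word):
    """Fraction of the name word that the query word covers."""
    return len(query_word) / len(name_word)


def word_score(query_word, name_word, position, word_count):
    """Product of the three weights; 0.0 if the query word is not found."""
    if query_word not in name_word:
        return 0.0
    return (index_weight(query_word, name_word)
            * position_weight(position, word_count)
            * coverage_weight(query_word, name_word))


def rank_food(query, food_name):
    """Sum over query words of the best score among the name's words."""
    words = tokenize(food_name)
    return sum(
        max((word_score(q, w, i, len(words)) for i, w in enumerate(words)), default=0.0)
        for q in tokenize(query)
    )


def search(query, food_names, limit=10):
    """Return up to `limit` matching names, highest rank first."""
    scored = [(rank_food(query, name), name) for name in food_names]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:limit]]
```

With these assumptions, the query "butter" scores 1.0 against "Butter, salted" (first word, full match from index 0) and 0.5 against "Peanut butter" (same match, but in the second of two words), so names that begin with the query word rise to the top.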


The search works great, but there still remains the problem that it is fairly slow. Right now it searches every time a letter of the query string is modified, so the ranking is run very often. I will look into how easily I can launch a timer and only search every ~0.5 seconds or so if the query string has been updated.
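One way the timer idea could be structured, sketched in Python with hypothetical names: keystrokes only record the latest query text, and a repeating timer (roughly every 0.5 seconds, driven by whatever timer the UI framework provides) calls `tick()`, which runs the expensive search only if the text changed since the last tick.

```python
class PolledSearch:
    """Runs a search at most once per timer tick, using the latest query."""

    def __init__(self, search_fn):
        self._search_fn = search_fn  # e.g. the ranking search; any callable
        self._pending = None         # latest query text not yet searched
        self.results = []

    def on_query_changed(self, query):
        """Called on every keystroke; cheap, it only remembers the text."""
        self._pending = query

    def tick(self):
        """Called by the repeating timer. Returns True if a search ran."""
        if self._pending is None:
            return False
        query, self._pending = self._pending, None
        self.results = self._search_fn(query)
        return True
```

Typing "b", "bu", "but" between two ticks therefore costs one search for "but" instead of three, and an idle tick costs nothing.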


Next week will probably be my last week of real development before the end of the semester. I forgot to factor in time for making my video and write-up when I was setting goals last week, so I will need to reserve the last few days for that. On the plus side though, next week is reading days, so I will have more time to code (in theory at least). I will aim to clean up the nutrition history visualization page. Hopefully I can get meals to work, but I need to vow not to waste too much time on it or nothing else will get done. The next thing after that will probably be hard-coding in a recommended daily values table to get the visualization information normalized properly. After that I will clean up the UI and clean up the code a little bit so it isn't too bad for Norm and Joe to review.

Sunday, December 4, 2011

Beta Review and The Last Three Weeks

I met with Norm and Joe for my Beta Review on Monday. I think that things went much better than the Alpha Review, and I am much happier with my progress. One thing that Norm and Joe said they would like to see was better ease of use. Norm thought it would be good to improve the food search functionality and its results, possibly using something similar to PageRank and maybe caching results. It would also be good to improve the ordering of the nutrient listings to better reflect what users want.

My goals for the remainder of the semester are as follows:

Next Week (12/5 - 12/11)

Reading Week (12/12 - 12/18)
  • Speed up food search using some kind of ranking algorithm
    • If the techniques described in the above two articles are impractical given time constraints, I will fall back to just sorting by the percentage of the word the search string matches.
  • Sort nutrient information based on what is most important
Finals Week (12/19 - 12/21)
  • If search needs refinement, I will spend the rest of my time refining the algorithm
  • If search is well done, I will instead work on adding another feature, perhaps specifying a personal diet goal, or adding in a preview view before permanently adding in a food item.

Friday, December 2, 2011

My Current Status Report and Self Evaluation

This is basically a post analyzing my current status on my project and evaluating how successfully I accomplished my goals. I think it is best to start out by listing my original goals and highlighting the status of each feature. Hopefully by also categorizing the goals I can get a better idea of what I did well and what is still lacking.

X = Completed (No major additions needed)
x = Mostly Complete (Functional, but may need small tweaks/additions)
/ = In Progress (Started, but not really functional for practical use)
  = Not Started
- = Originally Planned but Postponed
* = Future feature never planned in the timeline


User Interface
[x] Add Food Page (still needs option for number of servings)
[X] Page Displaying Day's Intake of Food
[X] Page Displaying Total Nutrition For Day
[*] Nutrition Facts Page
[*] Preview Page (preview effect of food on day's nutrition)
[-] Barcode Scanning
[*] Voice Input


Database Management
[X] Copy Over Consumed Information to Local DB
[x] Add meals to local DB (need to set dates and servings)
[-] Scrape in Food Product Information
[*] Setup on Server for Main DB


Usability Features
[/] Food Lookup
[ ] Food Search Personalization (Give your common meals higher priority in search)
[ ] Filter preferred nutrients (Only visualize specified nutrients)
[-] UPC Lookup


Analysis Features
[/] Nutrition Visualization (have the nutrition calculated, but still need improvements in presentation order)
[ ] Personalized Goals


In retrospect, I think that I am in decent shape. I did underestimate how much time it would take to get the database and the basic framework up and running. This was definitely the biggest bottleneck during the early stages of development. In order to get caught up, I had to cut out UPC Lookup and Database scraping. My main reason for cutting them was that they would have required even more work hacking two different databases together. I did not allot enough time for either of these, so I think that it was the right decision to postpone them. Otherwise, I may have had a database with a wealth of data, but no functionality to show for it.


After my Alpha Review, the general consensus was that I didn't have enough to show for my work. I then decided that I needed to refocus on some immediate usable features. I refocused my efforts on hammering out the UI, and getting actual data interaction. I think that this was a very beneficial decision. I now have something I can actually interact with and test that shows concrete results. Even though some of the implementation is not finished, I have a much better idea of what works, and what parts of my initial assumptions were correct and which were wrong.


There were times during development when I think I lost focus on what I wanted to do. I set out with the mindset that this would be a project that I would continue even after the semester ends (which I still do). However, I think that clouded my thoughts when it came to making decisions on what needed to be done, because I got caught up in my long term goals and didn't spend enough time implementing the short term goals. In terms of my project's full lifetime, the distribution of my efforts was probably more reasonable. However, my early approach was not the best for getting a working prototype up in just one semester.


I am pretty confident with my current direction now, and I think I have a good focus for the remaining three weeks. I will go into more detail about the specific features that I have chosen in my next post.

Sunday, November 20, 2011

Setting Up the Local User Database

I've finally set up the user database for the app. Now you can choose the food that you have eaten, and it will actually be stored (though in the emulator the saved data only lasts as long as the emulator is open [I can still debug multiple times though]). I still haven't set up the food to get added in with a date, so right now the today's meals pivot lists all food that has been entered into the database. Adding in multiple dates should be trivial (just tedious). I will just need to change the query so that it searches for any meal that matches the current date.
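The date filter could look something like the sketch below. It uses Python's built-in sqlite3 module purely for illustration; the app's real storage layer, the `meal_log` table, and its column names are all assumptions, not the actual schema.

```python
import sqlite3
from datetime import date


def create_meal_log(conn):
    """Create a hypothetical log table with one row per food eaten."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS meal_log ("
        "id INTEGER PRIMARY KEY, food_name TEXT NOT NULL, eaten_on TEXT NOT NULL)"
    )


def log_food(conn, food_name, eaten_on):
    """Store the food together with the day it was eaten (ISO date text)."""
    conn.execute(
        "INSERT INTO meal_log (food_name, eaten_on) VALUES (?, ?)",
        (food_name, eaten_on.isoformat()),
    )


def meals_for_day(conn, day):
    """Return only the foods logged on the given day, in entry order."""
    rows = conn.execute(
        "SELECT food_name FROM meal_log WHERE eaten_on = ? ORDER BY id",
        (day.isoformat(),),
    )
    return [row[0] for row in rows]
```

The today's meals page would then call `meals_for_day(conn, date.today())` instead of selecting every row.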



This week I'm going to work out the UPC (barcode) lookup. I am going to need to scrape a webpage into my database, which hopefully shouldn't take too long. Then I will use a barcode scanner library for reading. I will probably not have a UI for this by the end of the week, but hopefully I will at least be able to look up a product by its UPC.

Friday, November 11, 2011

UI Design Implementation

I spent this past week working on getting concrete implementations of the UI, to show what the design direction will look like. There are still two more pages that I need to lay out. One is the nutrition information pane, which will require a good plan for formatting all the non-standard nutrition data (i.e. the nutrients that aren't on every nutrition label but are on some). The other is a nutrition preview page. Eliot suggested that I should have a way to preview how a particular food item will affect the day's nutritional intake. This will likely look similar to the "today's nutrition" page shown below, but will need to take into account that some of the information is temporary. I would like it to look like a stacked bar graph, such that at the end of each bar showing the nutrition percentage, you will also see the portion that corresponds to the planned food item shown in a different color. Right now, I have the bars implemented as progress bars, so I will most likely need to come up with my own implementation from scratch in order to get this to work.
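Independent of how the bar ends up being drawn, the two segment lengths for the stacked preview bar could be computed along these lines. This is a small Python sketch with hypothetical names; clamping the bar at 100% of the daily value is an assumption about how overflow should be displayed.

```python
def stacked_bar_segments(consumed, planned, daily_value):
    """Return (consumed_fraction, planned_fraction) of one nutrient's bar.

    All three arguments are amounts in the same unit (e.g. grams).
    The two fractions together never exceed 1.0 (a full bar).
    """
    if daily_value <= 0:
        raise ValueError("daily_value must be positive")
    consumed_fraction = min(max(consumed, 0.0) / daily_value, 1.0)
    remaining = 1.0 - consumed_fraction
    planned_fraction = min(max(planned, 0.0) / daily_value, remaining)
    return consumed_fraction, planned_fraction
```

For example, with 30 units already eaten, 15 units in the previewed food, and a daily value of 60, the bar would be half filled in the normal color with another quarter in the preview color.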

This coming week, I plan on getting the food input added in. This will mean that I will need to be able to look up a food item from the database, and add it into a database log with all of the day's food. I will also aim to have the today's nutrition and today's meal pages accurately portray what you have eaten today.




Friday, November 4, 2011

Database Structure

After much pain in trying to find one database good enough to fit my needs, I have given in and decided to work with three databases together.

The first database is the USDA database that I have already been using. The structure for this database will dictate how the rest of the databases that get added on will be formed. There is one problem with the USDA database. Unfortunately for some reason it does not have data on many brand name products. I thought it was surprising that there is no government database that catalogs brand name product nutritional information, but I will have to make do in other ways.

Instead I will turn to the information from www.nutritional-information.info. The database on the site has a much better selection of brand name and restaurant food items. Unfortunately, I noticed that some of the data is not completely accurate (just by comparing values of food that I have lying around). However, the values aren't terribly off, and for now it provides me with a large set of data to develop off of. When I reach the point that I want to actually publish this app, I can look into licensing a non-free database with hopefully more accurate information. There isn't a direct database access feature for this database, so I can't use the site as is. Instead I am going to write a script to scrape all of the information off the site and merge it into the USDA database.

Next I will need a way to look up products by barcode. Unfortunately neither of these two databases has any reference to UPC codes, so I will need yet another source for this. There are a number of UPC lookup websites (I haven't chosen a particular one yet; I will need to see which one has the most consistent lookup success). The websites really only provide a product name, so the best I can do is then query the database using the product name as a search string. Hopefully this should generally provide good results.
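The two-step lookup described here can be sketched as below (Python, hypothetical names throughout). A plain dictionary stands in for whichever UPC lookup website ends up being used, and `search_foods` stands in for the name search over the nutrition database; the UPC and product name in the example are made up.

```python
def lookup_nutrition_by_upc(upc, upc_to_name, search_foods):
    """Resolve a UPC to a product name, then search the food db by name.

    upc_to_name:  mapping from UPC string to product name (stand-in for
                  the external UPC lookup service).
    search_foods: callable taking a search string and returning candidates.
    Returns an empty list when the UPC is unknown.
    """
    product_name = upc_to_name.get(upc)
    if product_name is None:
        return []
    return search_foods(product_name)


# Example with made-up data:
upc_names = {"000000000000": "Example Brand peanut butter"}
foods = ["Peanut butter, smooth", "Butter, salted", "Apple, raw"]


def contains_any_word(query):
    """Toy name search: keep foods containing any word of the query."""
    words = query.lower().split()
    return [f for f in foods if any(w in f.lower() for w in words)]


candidates = lookup_nutrition_by_upc("000000000000", upc_names, contains_any_word)
```

Since the name search returns candidates rather than one exact product, the user would presumably still have to confirm the right entry from the list.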

After looking at the competing nutrition apps, it seems like the full coverage that they have gotten is through crowd-sourcing their data. Essentially, in order to fill in the gaps of the database, I can add the option for users to manually enter any food that isn't already in the database. Sure, a lot of users won't do this, but even a small number of contributors should greatly improve the breadth of the database.

For what it is worth, Norm also forwarded me information about a Harvard project which crowd-sources nutrition estimates http://www.seas.harvard.edu/news-events/press-releases/crowdsourcing-nutrition-in-a-snap. Essentially, you can just take a picture of your meal and send it out to the crowd and get back an estimate of what the nutritional values are.

Alpha Review Feedback


I've gotten some feedback based on my Alpha Review presentation last week, and thought I would take some time to address and respond to some of the comments.

The biggest concern was that I wasn't very far into implementing my project, and was still stuck in the concept phase. I agree somewhat; I don't have nearly as much implemented as I had wanted. I had a lot of trouble finding the right database, and then getting it to integrate into the app. I'll go into a little more detail about what my final choices for databases will be in my next post. I found that, coupled with a very busy early semester schedule, learning both Windows Phone development and database management at the same time was taxing on how many actual results I could put out. At this point, I think I am just getting past that initial hump, and so it should be full steam ahead as things start falling into place.

Many people had requested that I include UPC barcode support into the app for inputting food items. In theory this is a great idea. I'm not entirely certain that this will actually improve ease of input given the current set of data I have right now. The main issue is that there is no good (free) database that links UPC codes to nutritional information. Essentially the solution at the moment will have to be looking up the product in an additional database, then using the product name to search for nutritional information. I will go into more detail about the approach I plan to take in my next post.

There is also the issue that API support for the native Windows Phone barcode scanner has yet to be released, so I will need to use some other library to read the barcodes. I found two libraries: one is an old .NET barcode scanner, and the other is a recent C# port of the ZXing ("Zebra Crossing") barcode scanner (which is used in the Android barcode app).

Other Notable Comments/Suggestions:

What is the target audience, and what is the competition?

I think it's hard to really define my target as anything other than: people who want to track their diets using their phone. Most of the app competition, which I believe I briefly outlined in one of my early posts, either only tracks calorie counts or just gives analysis of individual foods without taking into account the rest of the day's food intake. How my app will differ is that it will track ALL nutritional information. I hope to break the preconceived notion that calories are all that matters. The problem in the past seems to have been that there are just way too many nutrients to keep track of, and so people fall back to just calories because it is one easy to understand number. However, we've come well past the point where technology can easily organize all of our nutritional intake for us. Instead of having to add up every nutrient yourself, you will get all of the counts automatically and hopefully get a better understanding of how your diet actually works. I will have another post this weekend showcasing the planned UI, which should better demonstrate how I will make it easier to follow all of the nutrients without being overwhelmed.

Perform OCR on the nutrition facts label

This was actually one of my early feature ideas, so I agree that it would be good to have (especially after seeing the poor showing for free access food product databases). However, since nutrition facts labels might not be laid out in the exact same pattern on every package, the parsing of the label risks being very complicated. While this is a great potential feature, I can foresee this as being something that could easily be a semester's worth of work on its own to get it to work reliably.

Create a webapp to view your information away from your phone.

This would be great to have. Eventually I will need to migrate the majority of the database to a server. However, setting this all up will likely take time that I would rather spend on other features given my limited amount of time. I'll try to slate this into my planning for next semester though.

I will be making a few extra posts this weekend. I think I have been a little hand wavy in my past posts, so I'll try to take some time this weekend to demonstrate my goals in a more concrete fashion.