Varozhka: Introduction (part 1)
As you may know, Netflix organized a competition for systems predicting user ratings for movies.
I'm sure a lot of bright people have ideas on how to improve those predictions, but do not have time to spend on it.
This project is a framework that automates most of the dirty work with the dataset, so you can concentrate on the prediction algorithm ;)
Current features:
- No additional DB engine is required. All indexes are loaded in memory.
- Abstract layer to play with data (this is where you plug in your own algorithm).
- Data access layer.
- Easy way to check RMSE against the probe set.
- Generation of submission dataset.
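To make the RMSE check concrete, here is a minimal sketch of the calculation. It is written in Python purely for illustration and is not Varozhka's actual code or API: the `rmse` function, the `(movie_id, user_id, rating)` tuple layout, and the toy probe data are all assumptions.

```python
import math
from typing import Callable, Iterable, Tuple

# Hypothetical in-memory layout for one probe entry:
# (movie_id, user_id, actual_rating). Not Varozhka's real data model.
ProbeEntry = Tuple[int, int, float]


def rmse(probe: Iterable[ProbeEntry],
         estimate: Callable[[int, int], float]) -> float:
    """Root mean squared error of `estimate` over the probe entries."""
    squared_error = 0.0
    count = 0
    for movie_id, user_id, actual in probe:
        diff = estimate(movie_id, user_id) - actual
        squared_error += diff * diff
        count += 1
    if count == 0:
        raise ValueError("probe set is empty")
    return math.sqrt(squared_error / count)


# Toy probe set, made up for this example.
toy_probe = [(1, 10, 4.0), (1, 11, 2.0), (2, 10, 5.0)]

# A trivial estimator that always predicts 3 stars.
toy_score = rmse(toy_probe, lambda movie_id, user_id: 3.0)
```

Lower is better: an estimator that matches every probe rating exactly scores 0.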
So, basically, you can download the Netflix dataset, extract it to a directory, start a wizard (which does all the import tasks), implement your own rating estimator, and use a wizard to submit the results to Netflix.
The project is named Varozhka (the Belarusian word for "fortune-teller"). It is hosted at Google Code and SourceForge.Net.
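To give an idea of what "implement your own rating estimator" could look like, here is a rough sketch of a pluggable estimator. Everything in it is hypothetical: the `RatingEstimator` interface, the `MovieMeanEstimator` baseline, and the tuple layout are made up for illustration and are not Varozhka's real classes (the project itself targets .NET; Python is used here only to keep the example short).

```python
from abc import ABC, abstractmethod
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical training record: (movie_id, user_id, rating).
Rating = Tuple[int, int, float]


class RatingEstimator(ABC):
    """Hypothetical plug-in point: the framework would call these methods."""

    @abstractmethod
    def train(self, ratings: Iterable[Rating]) -> None:
        ...

    @abstractmethod
    def estimate(self, movie_id: int, user_id: int) -> float:
        ...


class MovieMeanEstimator(RatingEstimator):
    """Baseline: predict each movie's average rating from the training data."""

    def __init__(self) -> None:
        self._global_mean = 0.0
        self._movie_mean: Dict[int, float] = {}

    def train(self, ratings: Iterable[Rating]) -> None:
        totals: Dict[int, float] = defaultdict(float)
        counts: Dict[int, int] = defaultdict(int)
        grand_total = 0.0
        n = 0
        for movie_id, _user_id, rating in ratings:
            totals[movie_id] += rating
            counts[movie_id] += 1
            grand_total += rating
            n += 1
        if n == 0:
            raise ValueError("no training ratings")
        self._global_mean = grand_total / n
        self._movie_mean = {m: totals[m] / counts[m] for m in totals}

    def estimate(self, movie_id: int, user_id: int) -> float:
        # Fall back to the global mean for movies unseen during training.
        return self._movie_mean.get(movie_id, self._global_mean)


# Toy usage with made-up ratings.
baseline = MovieMeanEstimator()
baseline.train([(1, 10, 4.0), (1, 11, 2.0), (2, 10, 5.0)])
```

The point of the split is that the import, the probe check, and the submission step only need the `estimate` call, so swapping in a smarter algorithm does not touch the rest of the pipeline.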
This is an introductory post about the project. More details later...
NOTE: The project is under development, and most of the code is not optimized in any way.
Labels: .net, netflix, netflix prize, netflixprize, varozhka