2–6 Mar 2009
Le Ciminiere, Catania, Sicily, Italy
Europe/Rome timezone

Using Grids to support Recommender Systems: porting Collaborative Filtering recommendations on gLite

3 Mar 2009, 16:24
12m
Foyer (Le Ciminiere, Catania, Sicily, Italy)

Viale Africa 95100 Catania
Demo
Experiences from application porting and deployment
Demo Session

Speaker

Mr Leandro Ciuffo (Istituto Nazionale di Fisica Nucleare)

Description

Recommender systems (RS) are best known for their use in e-commerce websites, where they provide customers with a list of tailored items. However, such systems require computations that grow polynomially with the number of users and products. For a large retailer like Amazon.com, with tens of millions of customers and millions of catalog items, generating recommendations requires substantial computing power. The proposed work presents a "gridified" implementation of a classic RS algorithm.

URL for further information

http://canalcinefilia.com.br

Impact

Grids first emerged within scientific communities, such as HEP experiments. However, the enormous research activity of recent years has contributed to the development of new areas of interest. Commercial users have been attracted by this technology, which SMEs can potentially exploit to offer new services at reduced cost and with higher performance.
Hence, the impact of the proposed work is twofold: (i) to showcase GRelC and how applications can interact with on-line databases when running on gLite; (ii) to demonstrate (mainly to SMEs) the benefits of using grids to run CRM tools such as Recommender Systems.

Detailed analysis

To perform this case study we developed an RS to recommend movies on the Cinefilia website (www.canalcinefilia.com.br). Our current dataset consists of more than 830 movies, which have received 32,817 ratings from 327 unique users. Users can freely interact with the Cinefilia website and rate as many movies as they wish. Cinefilia uses a standard MySQL database to store and retrieve information about its users. Our application is currently deployed on GILDA and uses GRelC to interface with the MySQL database. In our implementation, we use parametric jobs in which each user ID is a parameter of the actual executable file, called "recommender". This is compiled code, originally written in C, that implements the classic CF algorithm and calculates recommendations for a single user, specified as a parameter. To keep track of all executions of our Recommender System, a post-processing script stores relevant statistics in the AMGA Metadata Catalogue.
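The per-user computation described above can be sketched as follows. This is only an illustration in Python, not the C code of the "recommender" executable: the abstract does not say which CF variant is used, so the sketch assumes user-based filtering with Pearson correlation. The names `pearson`, `recommend` and `SAMPLE_RATINGS` are hypothetical, and the ratings are held in memory rather than read from MySQL through GRelC.

```python
from math import sqrt

# Hypothetical in-memory ratings: user ID -> {movie: rating}.
SAMPLE_RATINGS = {
    1: {"A": 5, "B": 3, "C": 4},
    2: {"A": 5, "B": 3, "C": 4, "D": 5},
    3: {"A": 1, "B": 5, "C": 2, "D": 1, "E": 4},
}

def pearson(a, b):
    """Pearson correlation over the movies both users rated (0.0 if undefined)."""
    common = [movie for movie in a if movie in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[m] for m in common) / len(common)
    mean_b = sum(b[m] for m in common) / len(common)
    num = sum((a[m] - mean_a) * (b[m] - mean_b) for m in common)
    var_a = sum((a[m] - mean_a) ** 2 for m in common)
    var_b = sum((b[m] - mean_b) ** 2 for m in common)
    den = sqrt(var_a * var_b)
    return num / den if den else 0.0

def recommend(ratings, user_id, top_n=5):
    """Predict ratings for movies the given user has not rated yet.

    Each prediction is the similarity-weighted average of the ratings
    given by positively correlated users (an assumed design choice).
    """
    target = ratings[user_id]
    scores, weights = {}, {}
    for other_id, other in ratings.items():
        if other_id == user_id:
            continue
        sim = pearson(target, other)
        if sim <= 0:
            continue  # ignore dissimilar users
        for movie, rating in other.items():
            if movie in target:
                continue  # already rated by the target user
            scores[movie] = scores.get(movie, 0.0) + sim * rating
            weights[movie] = weights.get(movie, 0.0) + sim
    predicted = {movie: scores[movie] / weights[movie] for movie in scores}
    # Highest predicted rating first; ties broken by movie name.
    return sorted(predicted.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
```

Because `recommend` needs only one user ID as input, one call maps naturally onto one instance of a parametric job, with the ID supplied as the job parameter.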

Conclusions and Future Work

The distributed approach used in this case study has helped to reduce the complexity of the Collaborative Filtering algorithm from O(m²n) to O(mn) per job, because each job calculates recommendations for a single user.
This work has also produced a free on-line service that can be used to advertise Grids and GRelC. As future work, we intend to deploy our application on the EELA-2 production Grid infrastructure.
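As a rough illustration of the complexity figures, the sketch below counts rating comparisons under the assumption that m is the number of users and n the number of movies (the abstract does not define the symbols). The function names are hypothetical, and the counts are an idealised upper bound, not measurements from GILDA.

```python
def single_user_cost(m, n):
    """Comparisons needed to score one user against the other m - 1 users over n movies."""
    return (m - 1) * n

def all_users_cost(m, n):
    """Comparisons needed when one process scores all m users in sequence."""
    return m * single_user_cost(m, n)

if __name__ == "__main__":
    # Dataset size quoted in the abstract: 327 users, a little over 830 movies.
    m, n = 327, 830
    print("one job (single user):", single_user_cost(m, n))
    print("all users in sequence:", all_users_cost(m, n))
```

The total amount of work is unchanged by the split; what shrinks is the work each job has to do, and the jobs can run in parallel on different worker nodes.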

Justification for delivering demo and technical requirements (ONLY for demonstrations)

This demo is open to any participant who would like to get movie recommendations.
Visitors will be invited to rate movies on the Cinefilia website. The application that generates recommendations will then be launched on GILDA. Once the job execution is completed (it takes a couple of minutes for a single user), the user will be able to get his/her recommendations on-line. The only requirement is an Internet connection to access either the Cinefilia website or the GILDA UI via SSH.

Keywords

Recommender Systems, Personalization, Collaborative Filtering, CRM, GRelC, GILDA

Author

Mr Leandro Ciuffo (Istituto Nazionale di Fisica Nucleare)
