In this blog post, we walk through the NVTabular workflow steps using the ~1.3 TB Criteo dataset, which CriteoLabs shared for the Kaggle challenge on predicting ad click-through rate (CTR). We show how to use NVTabular as a preprocessing library to prepare the Criteo dataset on a single V100 32 GB GPU. The dataset's large memory footprint is an excellent opportunity to highlight a key advantage of NVTabular: it loads and transforms data in an online fashion, so it can process datasets much larger than what fits into available GPU memory.
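To make the idea of online processing concrete, here is a minimal sketch in plain Python. It is not NVTabular's API and does not touch the GPU; it only illustrates the general pattern of streaming a dataset in fixed-size chunks, first to gather statistics and then to apply a transformation, so that only one chunk is held in memory at a time. The function names (`iter_chunks`, `fit_mean_std`, `transform_chunks`) and the choice of normalization as the example transform are assumptions made for illustration.

```python
import math
from typing import Iterable, Iterator, List, Tuple


def iter_chunks(values: Iterable[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield successive chunks so only one chunk is in memory at a time."""
    chunk: List[float] = []
    for value in values:
        chunk.append(value)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly shorter, chunk
        yield chunk


def fit_mean_std(chunks: Iterable[List[float]]) -> Tuple[float, float]:
    """First pass: accumulate running sums across chunks to get mean and std."""
    count = 0
    total = 0.0
    total_sq = 0.0
    for chunk in chunks:
        count += len(chunk)
        total += sum(chunk)
        total_sq += sum(v * v for v in chunk)
    mean = total / count
    variance = max(total_sq / count - mean * mean, 0.0)
    std = math.sqrt(variance)
    return mean, (std if std > 0 else 1.0)  # avoid dividing by zero later


def transform_chunks(
    chunks: Iterable[List[float]], mean: float, std: float
) -> Iterator[List[float]]:
    """Second pass: normalize each chunk using the global statistics."""
    for chunk in chunks:
        yield [(v - mean) / std for v in chunk]


# Hypothetical toy data standing in for one continuous column.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
mean, std = fit_mean_std(iter_chunks(data, chunk_size=2))
normalized = [
    v
    for chunk in transform_chunks(iter_chunks(data, chunk_size=2), mean, std)
    for v in chunk
]
```

The two-pass structure is the point of the sketch: global statistics are computed without ever materializing the full column, and the transform is then applied chunk by chunk.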