I'm currently working on a study project about people's manga preferences, and I'm a little confused. For the dataset, I parsed one site and got a table that looks like this:

+---------+----------+---------------------+-----------------+------------------+
| user_id | user_sex |   favorite_genres   | favorite_titles | abandoned_titles |
+---------+----------+---------------------+-----------------+------------------+
| int     |    str   | set(genre:n_titles) |       list      |       list       |
+---------+----------+---------------------+-----------------+------------------+

The goal of the project is to build a recommendation system that offers people titles they are more likely to enjoy. For that, I need to somehow split the information about favorite genres, favorite titles, and abandoned titles so that I can work with each of them individually.

For the project I use Python and its library pandas. The table above is stored as a pd.DataFrame; every column that is not int has the object dtype, and that is the source of my struggle. I can't decide how to design the data so that it is more convenient to work with. I thought about using SQL instead of Python, but I have even less experience with it than with Python, which brings me to my question.
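To make the design problem concrete, here is a minimal sketch of one possible layout: splitting the wide table into "long" tables with one row per user and title, and one row per user and genre. The sample rows, the helper name explode_titles, and the status column are hypothetical, and the sketch assumes favorite_genres is held as a dict mapping genre to number of titles, which may not match the real parsed data.

```python
import pandas as pd

# Hypothetical sample rows; the real values come from the parsed site.
# Assumption: favorite_genres is a dict of genre -> number of titles.
users = pd.DataFrame({
    "user_id": [1, 2],
    "user_sex": ["f", "m"],
    "favorite_genres": [{"action": 3, "drama": 1}, {"comedy": 2}],
    "favorite_titles": [["Title A", "Title B"], ["Title C"]],
    "abandoned_titles": [["Title D"], []],
})


def explode_titles(df, column, status):
    """Turn a list-valued column into one row per (user_id, title)."""
    return (
        df[["user_id", column]]
        .explode(column)            # one row per list element
        .dropna(subset=[column])    # empty lists become NaN; drop them
        .rename(columns={column: "title"})
        .assign(status=status)      # hypothetical label column
    )


# One long table for both favorite and abandoned titles.
titles = pd.concat(
    [
        explode_titles(users, "favorite_titles", "favorite"),
        explode_titles(users, "abandoned_titles", "abandoned"),
    ],
    ignore_index=True,
)

# One long table for genres: one row per (user_id, genre).
genres = pd.DataFrame(
    [
        (user_id, genre, n_titles)
        for user_id, genre_counts in zip(users["user_id"], users["favorite_genres"])
        for genre, n_titles in genre_counts.items()
    ],
    columns=["user_id", "genre", "n_titles"],
)

print(titles)
print(genres)
```

In this layout each table has only scalar columns, so ordinary pandas operations such as groupby and merge (or the equivalent SQL joins on user_id) work without having to unpack object-typed cells first.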

Question.

Is there any online resource, book, etc. that would help me design a more convenient way to work with this kind of data? What would you personally advise me to study for this kind of task? Any advice will be appreciated.

Thank you in advance!


