I am currently developing an application that lets students manage their courses, and I don't really know how to design the database for one specific feature. The client wants, much like Facebook, that when a student displays the list of people currently enrolled in a specific course, the people with the most mutual courses with the logged-in user are displayed first. It is almost the same as Facebook's "Friend suggestions" feature, with an additional filter.

As an additional feature, I would like to provide a search that lets students find one another, showing first in the search results the people with the most mutual courses with the logged-in user.

I currently use MySQL, I plan to use Cassandra for some other features, and I use Memcached for result caching and Sphinx for search.



The application is written in Python, BTW.

I also forgot to mention that the standard approach (using a nice MySQL query to calculate all of this with an ORDER BY clause) is way too slow. Since reads are much more frequent than writes, I would like most of the logic to happen once, when the user <-> course relation is added.

I thought about updating a "mutual courses" counter specific to one tuple (user, course), which would be increased for all users of a course when the logged-in user joins a new course (or decreased when they leave it).
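To make the counter idea concrete, here is a minimal sketch. It assumes the counter is keyed by a pair of users (one reading of the idea above), and it uses plain in-memory dictionaries as hypothetical stand-ins for the real tables; the names `enrollments`, `mutual`, `join_course`, `leave_course` and `classmates_ranked` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the real tables.
enrollments = defaultdict(set)  # course_id -> set of user_ids
mutual = defaultdict(int)       # (user_a, user_b) -> number of shared courses


def join_course(user_id, course_id):
    """Enroll user_id and bump the counter for every existing classmate."""
    if user_id in enrollments[course_id]:
        return  # already enrolled; avoid double counting
    for other in enrollments[course_id]:
        mutual[(user_id, other)] += 1
        mutual[(other, user_id)] += 1
    enrollments[course_id].add(user_id)


def leave_course(user_id, course_id):
    """Remove user_id and decrement the counter for the remaining classmates."""
    if user_id not in enrollments[course_id]:
        return
    enrollments[course_id].discard(user_id)
    for other in enrollments[course_id]:
        mutual[(user_id, other)] -= 1
        mutual[(other, user_id)] -= 1


def classmates_ranked(user_id, course_id):
    """List the other students in a course, most shared courses first."""
    others = enrollments[course_id] - {user_id}
    # Ties are broken by user id so the order is deterministic.
    return sorted(others, key=lambda o: (-mutual.get((user_id, o), 0), o))
```

The write cost is proportional to the size of the course being joined or left, while the read becomes a simple sort on a stored value, which matches the read-heavy workload described above.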

Suppose you have a table named Users whose primary key is UserID. Then you have a table called Friends with two columns, UserID (PK) and FriendUserID.

Suppose you have two users, 20 and 50.

When 20 adds 50 as a friend, the application adds a new row:

INSERT INTO `Friends` (`UserID`, `FriendUserID`) VALUES (20, 50)

And when 50 confirms the friendship, you add another row with the values swapped:

INSERT INTO `Friends` (`UserID`, `FriendUserID`) VALUES (50, 20)

When you want to find the mutual friends of 20 and 50, simply run:

SELECT `A`.`UserID` FROM `Friends` AS `A`, `Friends` AS `B` WHERE `A`.`FriendUserID` = 20 AND `A`.`UserID` = `B`.`UserID` AND `B`.`FriendUserID` = 50
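As a quick way to try the self-join, here is a small sketch using Python's built-in `sqlite3` module rather than MySQL. The composite primary key, the sample friendships and the `mutual_friends` helper are assumptions made for the example.

```python
import sqlite3


def mutual_friends(conn, user_a, user_b):
    """Return the ids of users who are friends with both user_a and user_b."""
    rows = conn.execute(
        """
        SELECT A.UserID
        FROM Friends AS A
        JOIN Friends AS B ON A.UserID = B.UserID
        WHERE A.FriendUserID = ? AND B.FriendUserID = ?
        ORDER BY A.UserID
        """,
        (user_a, user_b),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Friends ("
    " UserID INTEGER NOT NULL,"
    " FriendUserID INTEGER NOT NULL,"
    " PRIMARY KEY (UserID, FriendUserID))"
)

# Each confirmed friendship is stored as two rows, one per direction.
for a, b in [(20, 50), (20, 30), (50, 30), (20, 40)]:
    conn.execute("INSERT INTO Friends VALUES (?, ?)", (a, b))
    conn.execute("INSERT INTO Friends VALUES (?, ?)", (b, a))

print(mutual_friends(conn, 20, 50))
```

Here user 30 is the only one who is a friend of both 20 and 50. User 40 is a friend of 20 only, so it is left out.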

If you already have a solution and the only problem is the speed of that query, try doing the work earlier. Whenever a user's relationships change, rerun a job that computes these things and store all of the results away. Don't run this as a result of the request, when you need the result quickly. Do such expensive things only once, and do them before a request comes in.
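A rough sketch of such a job, assuming the enrollment data is available as two plain dictionaries. `courses_by_user`, `users_by_course` and the `precomputed` store are hypothetical names; in practice the store might be a table or Memcached.

```python
from collections import Counter

# Hypothetical store for precomputed results (a table or a cache in practice).
precomputed = {}


def recompute_mutual_counts(user_id, courses_by_user, users_by_course):
    """Count, for every other user, how many courses they share with user_id."""
    counts = Counter()
    for course_id in courses_by_user.get(user_id, ()):
        for other in users_by_course.get(course_id, ()):
            if other != user_id:
                counts[other] += 1
    return dict(counts)


def on_relationship_change(user_id, courses_by_user, users_by_course):
    """Run outside the request path whenever user_id joins or leaves a course."""
    precomputed[user_id] = recompute_mutual_counts(
        user_id, courses_by_user, users_by_course
    )
```

Note that a change for one user also changes the counts seen by that user's classmates, so a real job would need to refresh their entries as well.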

I'd break this into two queries and find the intersection in Python:

#Query 1 - Get the user's friends
SELECT friend_id FROM friends WHERE user_id = 'my user id'

#Query 2 - Get the users enrolled in the course
SELECT student_id FROM course_enrollment WHERE course_id = 'course id'

Then compute the intersection in Python. That way you can let the database do its caching and so on, with no joins to slow things down.
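The intersection step could look something like this, assuming each query result arrives as a list of one-column rows, as most Python database drivers return them. The function name `friends_in_course` is made up for the example.

```python
def friends_in_course(friend_rows, enrollment_rows):
    """Intersect the two query results; each row is a one-column tuple."""
    friend_ids = {row[0] for row in friend_rows}
    student_ids = {row[0] for row in enrollment_rows}
    return friend_ids & student_ids


# Example with made-up rows standing in for the two query results.
friend_rows = [(2,), (3,), (5,)]
enrollment_rows = [(3,), (4,), (5,), (6,)]
print(sorted(friends_in_course(friend_rows, enrollment_rows)))
```

Building both sets and intersecting them takes time roughly linear in the number of rows, so this stays cheap even for large courses.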