Hein (fub) wrote,

LiveJournal matching and clustering

OK, so I did some more work on the LJ 'find-new-friends-who-share-your-interests' thing. I programmed something to get the friends of a user, and their interests. To cluster them meaningfully, I need to calculate the number of interests they have in common. Here's my first run:
User 1           User 2           Overlap in interests
arnoudens        shironuchan      35
luna_puella      sol_nuada        24
luna_puella      shironuchan      21
arnoudens        luna_puella      19
fub              arnoudens        18
shironuchan      sol_nuada        17
fub              shironuchan      14
damanique        shironuchan      14
fub              thorkell         13
arnoudens        sol_nuada        12
fub              damanique        12
thorkell         sol_nuada        12
damanique        isabelgou        12
green_obsession  luna_puella      12
fub              chadu            11
fub              sol_nuada        11
isabelgou        luna_puella      10
thorkell         arnoudens        10
arnoudens        damanique        10
isabelgou        sol_nuada        9
thorkell         chadu            9
arnoudens        chadu            9
isabelgou        shironuchan      9
luna_puella      suckerrlove      9
chadu            sol_nuada        9
thorkell         shironuchan      8
arnoudens        green_obsession  8
damanique        luna_puella      8
sol_nuada        suckerrlove      8
velveteencat     damanique        8
green_obsession  shironuchan      8
green_obsession  suckerrlove      8
fub              luna_puella      8
shironuchan      suckerrlove      8
fub              cthani           8
damanique        sol_nuada        8
fub              green_obsession  8
chadu            shironuchan      7
arnoudens        suckerrlove      7
thorkell         luna_puella      7
green_obsession  sol_nuada        7
arnoudens        isabelgou        6
chadu            luna_puella      6
chadu            suckerrlove      6
thorkell         cthani           6
green_obsession  isabelgou        6
damanique        green_obsession  6
velveteencat     isabelgou        6
fub              isabelgou        6
damanique        suckerrlove      5
thorkell         green_obsession  5
velveteencat     luna_puella      5
arnoudens        cthani           4
velveteencat     green_obsession  4
thorkell         isabelgou        4
thorkell         damanique        4
cthani           damanique        4
cthani           isabelgou        4
thorkell         suckerrlove      4
cthani           sol_nuada        4
chadu            damanique        4
fub              rvdammit         4
smurfkiller      suckerrlove      4
cthani           shironuchan      4
thorkell         rvdammit         3
fub              suckerrlove      3
cthani           green_obsession  3
chadu            green_obsession  3
cthani           luna_puella      3
velveteencat     sol_nuada        3
chadu            cthani           3
chadu            isabelgou        3
cthani           rvdammit         3
fub              velveteencat     2
isabelgou        suckerrlove      2
suckerrlove      the_phenomenal   2
arnoudens        smurfkiller      2
velveteencat     shironuchan      2
velveteencat     cthani           2
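
For reference, a table like this one can be produced with plain set intersections. This is only a minimal sketch in Python, not the program I actually ran: it assumes each user's interests are already fetched into a set, and the function name and sample users are made up for illustration.

```python
from itertools import combinations


def pairwise_overlap(interests):
    """Return (user_a, user_b, shared_count) rows, largest overlap first.

    `interests` maps a username to a set of interest strings.
    Pairs with nothing in common are left out.
    """
    rows = []
    for a, b in combinations(sorted(interests), 2):
        shared = len(interests[a] & interests[b])  # set intersection
        if shared:
            rows.append((a, b, shared))
    rows.sort(key=lambda row: -row[2])  # stable sort keeps ties in name order
    return rows


# Made-up sample data, not real LiveJournal users.
sample = {
    "alice": {"rpg", "anime", "cats", "linux"},
    "bob": {"rpg", "anime", "cooking"},
    "carol": {"cats", "gardening"},
}
overlap_rows = pairwise_overlap(sample)
for user_a, user_b, shared in overlap_rows:
    print(f"{user_a:<17}{user_b:<17}{shared}")
```

Comparing every pair is quadratic in the number of users, which is fine for a friends list of this size but would need rethinking for a bigger crawl.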

I don't know yet what to make of it -- I think I need to normalise this against the number of interests each user lists. Also, I need to be a lot more conservative with traffic: if every user contributes 10 new users to the queue, then the first level is 10 users, the second is 100, and so on. Before you know it, you're generating too much traffic. Perhaps I should add a user's friends to the queue as their overlap increases, or something.
Also, if I create a cluster, I will still need to keep the user node available for clustering into another cluster, because theoretically you can be a member of two or more clusters with disjoint interests.
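
One way both ideas could fit together, as a rough Python sketch rather than a settled design: normalise the overlap with the Jaccard index (shared interests divided by all distinct interests of the two users, which is my pick here, other normalisations would work too), then use that score to decide whose friends get queued at all. The `get_friends` and `get_interests` callables, the `max_users` budget and the `min_score` threshold are all hypothetical stand-ins, not an actual LiveJournal API.

```python
import heapq


def jaccard(a, b):
    """Shared interests divided by all distinct interests of both users."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def crawl(seed, get_friends, get_interests, max_users=50, min_score=0.1):
    """Score users best-match-first and stop after max_users lookups.

    Only users scoring at least min_score have their own friends
    expanded, so a poor match does not add its ~10 friends to the queue.
    """
    seed_interests = get_interests(seed)
    scores = {}
    seen = {seed}
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so scores are negated
    while frontier:
        _, user = heapq.heappop(frontier)
        for friend in get_friends(user):
            if friend in seen:
                continue
            if len(scores) >= max_users:
                return scores  # traffic budget used up
            seen.add(friend)
            score = jaccard(seed_interests, get_interests(friend))
            scores[friend] = score
            if score >= min_score:
                heapq.heappush(frontier, (-score, friend))
    return scores


# Made-up graph: "b" shares nothing with "me", so "d" is never fetched.
interests = {
    "me": {"rpg", "anime", "cats"},
    "a": {"rpg", "anime"},
    "b": {"football"},
    "c": {"rpg", "cats", "tea"},
    "d": {"rpg"},
}
friends = {"me": ["a", "b"], "a": ["me", "c"], "b": ["d"], "c": [], "d": []}
scores = crawl("me", lambda u: friends.get(u, []), lambda u: interests[u])
```

The budget caps the total number of lookups no matter how the friend graph branches, and the threshold prunes whole subtrees behind poor matches.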
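
To make the multiple-membership point concrete, here is a small Python sketch. It assumes clusters are defined by a set of theme interests, which is purely my assumption for illustration (the themes, names and `min_shared` threshold are made up). The point is only that assigning a user to a cluster never removes them from the pool.

```python
def assign_clusters(interests, themes, min_shared=2):
    """Map each theme name to the set of users matching it.

    A user joins every theme they share at least min_shared interests
    with, so membership is not exclusive.
    """
    clusters = {name: set() for name in themes}
    for user, user_interests in interests.items():
        for name, theme in themes.items():
            if len(user_interests & theme) >= min_shared:
                clusters[name].add(user)
    return clusters


# Made-up themes with no interests in common.
themes = {
    "gaming": {"rpg", "d&d", "boardgames"},
    "japan": {"anime", "manga", "sushi"},
}
people = {
    "alice": {"rpg", "d&d", "anime", "manga"},
    "bob": {"rpg", "boardgames"},
    "carol": {"manga", "sushi", "cats"},
}
clusters = assign_clusters(people, themes)
```

Here alice ends up in both clusters even though the two themes are disjoint.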

More thinking is needed.