Hein (fub) wrote,

LiveJournal matching and clustering

OK, so I did some more work on the LJ 'find-new-friends-who-share-your-interests' thing. I programmed something to get the friends of a user, and their interests. To cluster them meaningfully, I need to calculate the number of interests they have in common. Here's my first run:
User 1           User 2           Overlap in interests
arnoudens        shironuchan      35
luna_puella      sol_nuada        24
luna_puella      shironuchan      21
arnoudens        luna_puella      19
fub              arnoudens        18
shironuchan      sol_nuada        17
fub              shironuchan      14
damanique        shironuchan      14
fub              thorkell         13
arnoudens        sol_nuada        12
fub              damanique        12
thorkell         sol_nuada        12
damanique        isabelgou        12
green_obsession  luna_puella      12
fub              chadu            11
fub              sol_nuada        11
isabelgou        luna_puella      10
thorkell         arnoudens        10
arnoudens        damanique        10
isabelgou        sol_nuada        9
thorkell         chadu            9
arnoudens        chadu            9
isabelgou        shironuchan      9
luna_puella      suckerrlove      9
chadu            sol_nuada        9
thorkell         shironuchan      8
arnoudens        green_obsession  8
damanique        luna_puella      8
sol_nuada        suckerrlove      8
velveteencat     damanique        8
green_obsession  shironuchan      8
green_obsession  suckerrlove      8
fub              luna_puella      8
shironuchan      suckerrlove      8
fub              cthani           8
damanique        sol_nuada        8
fub              green_obsession  8
chadu            shironuchan      7
arnoudens        suckerrlove      7
thorkell         luna_puella      7
green_obsession  sol_nuada        7
arnoudens        isabelgou        6
chadu            luna_puella      6
chadu            suckerrlove      6
thorkell         cthani           6
green_obsession  isabelgou        6
damanique        green_obsession  6
velveteencat     isabelgou        6
fub              isabelgou        6
damanique        suckerrlove      5
thorkell         green_obsession  5
velveteencat     luna_puella      5
arnoudens        cthani           4
velveteencat     green_obsession  4
thorkell         isabelgou        4
thorkell         damanique        4
cthani           damanique        4
cthani           isabelgou        4
thorkell         suckerrlove      4
cthani           sol_nuada        4
chadu            damanique        4
fub              rvdammit         4
smurfkiller      suckerrlove      4
cthani           shironuchan      4
thorkell         rvdammit         3
fub              suckerrlove      3
cthani           green_obsession  3
chadu            green_obsession  3
cthani           luna_puella      3
velveteencat     sol_nuada        3
chadu            cthani           3
chadu            isabelgou        3
cthani           rvdammit         3
fub              velveteencat     2
isabelgou        suckerrlove      2
suckerrlove      the_phenomenal   2
arnoudens        smurfkiller      2
velveteencat     shironuchan      2
velveteencat     cthani           2
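
For illustration, a minimal sketch of how overlap counts like the ones in the table could be computed. It is not the program that produced the numbers above: the function name `interest_overlap`, the sample users and their interests are all made up, and it assumes each user's interests are already available as a set of strings. Dropping pairs with no shared interests is also an assumption.

```python
from itertools import combinations

def interest_overlap(interests):
    """Return (user1, user2, shared_count) rows, highest overlap first.

    `interests` maps a username to a set of interest strings.
    Pairs without any shared interest are left out (an assumption).
    """
    rows = []
    for a, b in combinations(sorted(interests), 2):
        shared = len(interests[a] & interests[b])  # set intersection
        if shared:
            rows.append((a, b, shared))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows

# Made-up sample data, not real interest lists.
sample = {
    "alice": {"anime", "rpg", "linux", "cooking"},
    "bob": {"anime", "rpg", "cycling"},
    "carol": {"cooking", "gardening"},
}

for user1, user2, shared in interest_overlap(sample):
    print(user1, user2, shared)
```

Comparing every pair is quadratic in the number of users, which is fine for a friends list of this size but would need rethinking for a much larger crawl.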

I don't know yet what to make of it. I think I need to normalise this against the number of interests each user lists, because someone with a long interest list will tend to overlap with almost everyone. Also, I need to be a lot more conservative with traffic: if every user contributes 10 new users to the queue, then the first level is 10 users, the second is 100, and so on. Before you know it, you're generating too much traffic. Perhaps I should add a user's friends to the queue only as that user's overlap increases, or something.
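
A rough sketch of both ideas, under stated assumptions. For normalisation it uses the Jaccard index (shared interests divided by all distinct interests of the two users), which is only one possible choice. For traffic it uses a best-first queue with a hard request budget: a user's friends are queued with that user's score as their priority, and are not queued at all if the score is below a threshold. `crawl`, `fetch_profile`, `max_requests`, `min_score` and the `profiles` data are all hypothetical names and values; `fetch_profile` stands in for whatever actually talks to LiveJournal.

```python
import heapq

def jaccard(a, b):
    """Shared interests divided by all distinct interests of both users."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def crawl(start, fetch_profile, max_requests=20, min_score=0.1):
    """Best-first crawl with a request budget.

    `fetch_profile(user)` is a hypothetical callable returning
    (set_of_interests, list_of_friends) and costing one request.
    Returns {user: normalised overlap with `start`}.
    """
    my_interests, my_friends = fetch_profile(start)
    requests = 1
    seen = {start, *my_friends}
    # heapq is a min-heap, so scores are stored negated.
    # Direct friends get the highest possible priority.
    frontier = [(-1.0, friend) for friend in my_friends]
    heapq.heapify(frontier)
    scores = {}
    while frontier and requests < max_requests:
        _, user = heapq.heappop(frontier)
        interests, friends = fetch_profile(user)
        requests += 1
        score = jaccard(my_interests, interests)
        scores[user] = score
        if score < min_score:
            continue  # too little overlap: do not queue this user's friends
        for friend in friends:
            if friend not in seen:
                seen.add(friend)
                heapq.heappush(frontier, (-score, friend))
    return scores

# Made-up profiles: user -> (interests, friends).
profiles = {
    "me": ({"a", "b", "c", "d"}, ["x", "y"]),
    "x": ({"a", "b", "c"}, ["x1"]),
    "y": ({"z"}, ["y1"]),
    "x1": ({"a", "d"}, []),
    "y1": ({"a", "b", "c", "d"}, []),
}

print(crawl("me", profiles.__getitem__, max_requests=10))
```

In this toy run `y1` is never fetched, because `y` shares nothing with the starting user. That is the trade-off of pruning: the queue stays small, but a good match hiding behind a poor one is missed.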
Also, if I create a cluster, I will still need to keep the user node available for clustering into another cluster (because theoretically, you can be a member of two or more clusters with disjoint interests).

More thinking is needed.