Uploading, downloading, and synchronization
Uploading and downloading content and data
- Downloading your content and data:
- An API is best for downloading: It's
easiest for everybody involved when websites provide an API/interface
for uploading and downloading content and data. Many commercial
sites make it easy to upload content, but hard to download that same
content. We
plan to encourage more sites to provide APIs for downloading your
content and the data associated with that content.
- Screen scraping: If sites don't give us
an API for downloading the content and data associated with that
content, we'll get it one way or another. This is more work and
will take some clever programming, but it's still fully doable. We
have an entire world of smart programmers who want their data and can
help with this task.
See the Wikipedia entry for screen scraping.
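To give a feel for what screen scraping involves, here is a minimal sketch using only Python's standard library. It assumes, purely for illustration, that a site marks each photo title with `<span class="photo-title">`; that class name and the `scrape_titles` helper are made up, and a real site would need its own rules (and would break the scraper whenever its page layout changes).

```python
from html.parser import HTMLParser


class PhotoTitleScraper(HTMLParser):
    """Collect the text inside every <span class="photo-title"> element.

    The tag and class name are hypothetical examples, not any real site's markup.
    """

    def __init__(self):
        super().__init__()
        self._inside = False   # True while we are inside a matching <span>
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "span" and ("class", "photo-title") in attrs:
            self._inside = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "span":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.titles[-1] += data


def scrape_titles(html):
    """Return the photo titles found in a page of HTML."""
    parser = PhotoTitleScraper()
    parser.feed(html)
    parser.close()
    return [title.strip() for title in parser.titles]


page = (
    '<div><span class="photo-title">Sunset</span>'
    '<span class="caption">ignored</span>'
    '<span class="photo-title"> Beach </span></div>'
)
print(scrape_titles(page))
```

This simple version does not handle a span nested inside a title span; it is only meant to show the general shape of the work.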
- Uploading your content and data:
- An API is best for uploading: Again,
it's easier to upload to a site if they provide an API/interface for
us to do this. But if they don't give us an API, there are still
ways to upload content. Again note that many commercial sites do provide APIs for
uploading content.
- Uploading automatically through web forms:
We can write programs that simulate a user who is entering data through
a web form. If a site doesn't give us an API, we will be forced to
use this method.
- Keeping the sites clean and preserving the
flow: Our goal is to help the sites we put our content onto.
When uploading content and data, we must do it with care. We don't
want sites to feel crappy because data or content has been loaded into
the wrong place or duplicated. If done wrong, this tool could make
social sites feel more disjointed and confused.
- Example APIs and libraries: We will
provide sample interfaces and pre-built libraries that sites can use when
they want to add upload and download APIs to their site.
- RSS and Atom: RSS feeds and
Atom can both
be used in a similar fashion to an API for a site. To put it more
simply, these are ways of extracting information from a site in a format
that's easy for a machine to read.
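As a rough illustration of why feeds are easy for a machine to read, the sketch below pulls the title and link of each item out of an RSS 2.0 document with Python's standard library. The sample feed and the `read_rss_items` name are made up for this example; Atom uses different element names and XML namespaces, so it would need its own variant.

```python
import xml.etree.ElementTree as ET

# A tiny made-up RSS 2.0 feed, standing in for what a site might publish.
SAMPLE_FEED = """<rss version="2.0"><channel><title>My blog</title>
<item><title>First post</title><link>http://example.com/1</link></item>
<item><title>Second post</title><link>http://example.com/2</link></item>
</channel></rss>"""


def read_rss_items(rss_text):
    """Return a list of {"title", "link"} dicts, one per <item> in the feed."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        })
    return items


for entry in read_rss_items(SAMPLE_FEED):
    print(entry["title"], entry["link"])
```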
User options and features
- Selecting what to download/upload:
Users will be able to select which content/data they want to download/upload. A user
may wish to download all the tags, but not the photos. A user may want to download their profile information, in order to upload it
to another site. Or a user may want to download all of their content on
a site.
- Incremental uploads/downloads: A user may want to
upload/download all the content to/from a given site. Later on they may
want to update that content. Ideally they would not need to upload or
download all of the content again, just the content that has changed.
- Setting up user accounts and options for each site:
A user's settings for YouTube may be quite different from their settings for
LiveJournal. A user should be able to specify how each site
should be treated. If a site is to be accessed automatically, the
user's account name and password for it will need to be entered and stored.
- Upload content to many sites at once:
An image or set of images could be uploaded to Flickr, Tribe, and MySpace
automatically. A blog post could be published to several social networking
sites at once.
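One possible way to handle incremental uploads is to remember a fingerprint (a hash) of each item from the last run and only send items whose fingerprint has changed. The sketch below assumes content is available as bytes keyed by name; `plan_incremental_upload` and the item names are hypothetical, and nothing here talks to a real site.

```python
import hashlib


def fingerprint(data):
    """Return a SHA-256 hex digest that changes whenever the bytes change."""
    return hashlib.sha256(data).hexdigest()


def plan_incremental_upload(local_items, last_synced):
    """Work out which items need uploading.

    local_items: dict of item name -> content bytes.
    last_synced: dict of item name -> fingerprint saved by the previous run.
    Returns (sorted names to upload, new fingerprints to save for next time).
    """
    new_state = {name: fingerprint(data) for name, data in local_items.items()}
    changed = sorted(
        name for name, fp in new_state.items() if last_synced.get(name) != fp
    )
    return changed, new_state


items = {"tags.txt": b"beach,sunset", "profile.txt": b"name: Al"}
first_run, state = plan_incremental_upload(items, {})     # everything is new
items["tags.txt"] = b"beach,sunset,2006"                  # user edits the tags
second_run, state = plan_incremental_upload(items, state)
print(first_run, second_run)
```

The same idea works for downloads if the site exposes something comparable to a fingerprint, such as a last-modified date.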
Synchronize content between sites
- Some of the synchronization possibilities:
- From local machine to many sites:
Content can be put in a predetermined directory on your local machine
and then auto uploaded to many content sites.
- Update your profile on MySpace and it auto-updates
your profile on Tribe: You can make it so that updating
content on one site automatically updates your content on other
sites.
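The "local machine to many sites" case could look roughly like the sketch below: scan a directory, hand each file to an uploader for every site, and keep a record of what has been sent so nothing gets duplicated. The uploaders are hypothetical callables supplied by the caller; here they only print, since the real ones would depend on each site's API or web forms.

```python
from pathlib import Path


def sync_directory(directory, uploaders, already_sent):
    """Send each file in `directory` to every site that has not received it yet.

    uploaders: dict of site name -> callable taking a Path (hypothetical).
    already_sent: set of (site, filename) pairs, updated in place.
    Returns the list of (site, filename) pairs sent during this call.
    """
    sent_now = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        for site, upload in uploaders.items():
            key = (site, path.name)
            if key in already_sent:
                continue          # avoid loading duplicates onto the site
            upload(path)
            already_sent.add(key)
            sent_now.append(key)
    return sent_now


if __name__ == "__main__":
    demo_uploaders = {
        "flickr": lambda p: print("would upload", p.name, "to flickr"),
        "tribe": lambda p: print("would upload", p.name, "to tribe"),
    }
    print(sync_directory(".", demo_uploaders, set()))
```

A real tool would also need to save `already_sent` between runs and notice files that have changed, not just new ones.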
Trickier synchronization issues
- Taking your testimonials with you: Say that you
were on Friendster and lots of people gave you rad testimonials.
Then you moved to Tribe or MySpace and lost those testimonials. What
is to be done?
- Make them into a comment: One option is
to turn all the testimonials from Friendster into a single comment on
the new site. This isn't great, but it's likely better than
nothing.
- Resubmit on the new site: This is
getting tricky and still takes some manual effort, but the tool could
send a request to each of the folks who gave you a testimonial.
That request could make it easy for that person to submit the same
testimonial on the new site.
- Synchronizing the content between sites:
A user may want to keep loading their content onto Flickr as they always have.
This tool should ideally allow for that.
- Taking your friend network with you: This is a difficult problem.
A solution will be challenging to implement.
- The solution uses fuzzy logic to match users:
This system needs to be able to compare and match users on different
social networking systems. This comparison will need to be done
using fuzzy logic. The MoveMyData tool can scan your social
network of friends on a given site. It can gather key information
about your friends. It can then go to another site and try to
determine if there are other users with the same name, birthday, home
town, email, homepage/website, etc. It can also see which of these
users tend to fall into the same social circles. The main problems
with this solution are that it would require a lot of queries on
the social networking sites, the coding would be tricky, and some
matches would be wrong. In the long run this is likely to be the solution
we use.
- An example: Our system is trying to
determine if "Albert Wiseman" from LinkedIn is the same user as "SuperAboy"
from MySpace. It will decide this based primarily on things like
these user accounts having the same birthday and same zip code.
Also, Albert Wiseman on LinkedIn is friends with Suzie Anton on LinkedIn.
Suzie lists her home page on both LinkedIn and MySpace. Based on
this, we're fairly sure that Suzie is the same user on both sites.
And because Suzie is friends with Albert on LinkedIn, we have further
evidence that "Albert Wiseman" and "SuperAboy" are the same user.
- Running queries: The MoveMyData tool may
run a query on MySpace in order to search for users who are likely to be
"Albert Wiseman".
- Seeding the social network: If you are
on LinkedIn and want to move your social network to MySpace, this system
will work best if you first add a few friends on MySpace manually.
With a few friend connections already in place, it will be more likely
to be able to determine who is who. Ideally, when making these
initial connections, become friends with people who know several other
key friends.
- Challenges:
- This will require many queries and
comparisons: Computers will need to do a fair bit of querying,
estimating, and computing. This will take time and
resources from the user's machine and the social networking site.
- It will have inaccuracy due to the fuzzy
logic: This system will not be 100% accurate, but we should be
able to get a fair amount of accuracy.
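To make the matching idea a little more concrete, here is a minimal scoring sketch. It is one possible approach, not a settled design: the field weights, the friend bonus, and the sample birthdays and zip codes are all invented for illustration. Names are compared loosely with `difflib`, other fields must match exactly, and each friend we have already matched on both sites adds a little extra evidence.

```python
from difflib import SequenceMatcher

# Illustrative weights (assumptions): exact fields like birthday count for
# more than a display name, which users often change between sites.
WEIGHTS = {"name": 0.2, "birthday": 0.3, "zip": 0.2, "homepage": 0.3}


def field_similarity(field, a, b):
    """Return a similarity between 0.0 and 1.0 for one profile field."""
    if not a or not b:
        return 0.0                      # missing data is no evidence
    if field == "name":
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return 1.0 if a.strip().lower() == b.strip().lower() else 0.0


def match_score(profile_a, profile_b, shared_friends=0):
    """Estimate how likely two profiles belong to the same person (0.0 to 1.0).

    shared_friends: number of friends already matched on both sites.
    """
    score = sum(
        weight * field_similarity(field, profile_a.get(field), profile_b.get(field))
        for field, weight in WEIGHTS.items()
    )
    score += min(shared_friends, 3) * 0.1   # capped bonus per mutual friend
    return min(score, 1.0)


albert = {"name": "Albert Wiseman", "birthday": "1975-04-02", "zip": "94110"}
superaboy = {"name": "SuperAboy", "birthday": "1975-04-02", "zip": "94110"}
stranger = {"name": "Al Wise", "birthday": "1982-11-30", "zip": "10001"}

print(match_score(albert, superaboy, shared_friends=1))  # Suzie is a mutual friend
print(match_score(albert, stranger))
```

In practice the tool would compare scores against a threshold and probably ask the user to confirm borderline matches, since some inaccuracy is expected.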
- FOAF + fuzzy logic is the long-term
solution: There is an excellent
FOAF (Friend of a Friend)
project which is partly intended to allow users to move their friend
networks from one social networking system to another. Marc
Canter has some
interesting negative thoughts on FOAF.
- Adoption of FOAF: It requires that the
social networking site implement FOAF. Tribe.net has implemented
FOAF, but not enough other social networking sites have.
- Simulation of adoption of FOAF: If a
social networking site isn't playing nice, there are still things that
can be done. The information for FOAF can be screen scraped from a
social networking site for every user that we're checking. This
system would work, but could use a lot of resources from the social
networking site.
- Simulated, aggregated, stored FOAF: The
FOAF-like information on users of MySpace (for example) could be screen
scraped and then aggregated into a database that others can use.
This would be a powerful tool for helping users migrate their friends
to and from a given social networking site. On the other hand
there may be legal problems with aggregating such data. One
advantage of aggregated data is that it would use fewer resources from
the social networking site. Of course each social networking site
implementing FOAF is the real solution we would want to push for.
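For a sense of what a site-provided FOAF file gives us, the sketch below reads a person's name and friend list from a small FOAF document in RDF/XML. The sample document and the `read_foaf_friends` name are made up, and the code only understands this one simple nested layout; FOAF is RDF and can be written in several other ways, so a real tool would want a proper RDF parser.

```python
import xml.etree.ElementTree as ET

# XML namespaces used by RDF and by the FOAF vocabulary.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# A made-up FOAF file for illustration.
SAMPLE_FOAF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Albert Wiseman</foaf:name>
    <foaf:knows>
      <foaf:Person><foaf:name>Suzie Anton</foaf:name></foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>"""


def read_foaf_friends(foaf_text):
    """Return (owner name, list of friend names) from a simple FOAF file."""
    root = ET.fromstring(foaf_text)
    owner = root.find("foaf:Person", NS)
    if owner is None:
        return None, []
    name = owner.findtext("foaf:name", default="", namespaces=NS)
    friends = [
        person.findtext("foaf:name", default="", namespaces=NS)
        for person in owner.findall("foaf:knows/foaf:Person", NS)
    ]
    return name, friends


print(read_foaf_friends(SAMPLE_FOAF))
```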
- Problematic
ways to identify a user/friend:
- Using email isn't an ideal ID: Orkut (Google's social networking system)
allows users to see their friends' email addresses, but this is not
typical in social networking sites. If email is used as an
identifier, addresses will too often end up in unwanted
hands.
- Usernames and names are not good IDs:
Usernames tend to be different across sites, and on sites like
tribe.net a user can change the name that's displayed. A
person's actual legal name is often shared with other people on a
social networking site. Clearly neither is a good unique
identifier across sites.
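One partial workaround for the email problem, which the FOAF vocabulary supports through its `mbox_sha1sum` property, is to compare a SHA-1 hash of the `mailto:` address rather than the address itself. The sketch below shows the idea; trimming and lowercasing the address first is our own assumption for consistency, not something FOAF requires.

```python
import hashlib


def mbox_sha1sum(email):
    """Return the SHA-1 hex digest of the mailto: URI for an email address.

    Trimming and lowercasing are assumptions made here so that the same
    address typed two ways still produces the same hash.
    """
    uri = "mailto:" + email.strip().lower()
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


# Two sites can compare hashes without showing each other the raw address.
print(mbox_sha1sum("Suzie@example.com") == mbox_sha1sum("suzie@example.com "))
```

This lowers the exposure but does not remove it: anyone who already has a list of addresses can hash them and look for matches.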
This is being created in association with
