ch9 P2P system

Definition of P2P
1) Significant autonomy from central servers
2) Exploits resources at the edges of the Internet
○ storage and content
○ CPU cycles
○ human presence
3) Resources at edge have intermittent connectivity, being added & removed
It’s a broad definition
□ P2P file sharing
○ Napster, Gnutella, KaZaA, eDonkey, etc
□ P2P communication
○ Instant messaging
○ Voice-over-IP: Skype
□ P2P computation
○ seti@home
□ DHTs & their apps
○ Chord, CAN, Pastry, Tapestry
□ P2P apps built over
○ emerging overlays
PlanetLab
Wireless ad-hoc networking not covered here

Tutorial Outline (1)
□ 1. Overview: overlay networks, P2P applications, copyright issues, worldwide computer vision
□ 2. Unstructured P2P file sharing: Napster, Gnutella, KaZaA, search theory, flashfloods
□ 3. Structured DHT systems: Chord, CAN, Pastry, Tapestry, etc.

1. Overview of P2P
□ overlay networks
□ P2P applications
□ worldwide computer vision

Overlay networks
Overlay graph
Virtual edge
□ TCP connection
□ or simply a pointer to an IP address
Overlay maintenance
□ Periodically ping to make sure neighbor is still alive
□ Or verify liveness while messaging
□ If neighbor goes down, may want to establish new edge
□ New node needs to bootstrap

More about overlays
Unstructured overlays
□ e.g., new node randomly chooses three existing nodes as neighbors
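The random-neighbor rule above can be sketched in a few lines; the node names and the three-neighbor degree are just this example's assumptions:

```python
import random

# Sketch of unstructured-overlay join: a new node picks a few existing
# nodes at random as neighbors (three, per the example above).
def join(overlay, new_node, degree=3):
    neighbors = random.sample(sorted(overlay), min(degree, len(overlay)))
    overlay[new_node] = set(neighbors)
    for n in neighbors:              # virtual edges are bidirectional
        overlay[n].add(new_node)

overlay = {"a": {"b"}, "b": {"a"}, "c": set(), "d": set()}
join(overlay, "e")
assert len(overlay["e"]) == 3
```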
Structured overlays
□ e.g., edges arranged in restrictive structure
Proximity
□ Not necessarily taken into account

Overlays: all in the application layer
Tremendous design flexibility
○ Topology, maintenance
○ Message types
○ Protocol
○ Messaging over TCP or UDP
Underlying physical net is transparent to developer
○ But some overlays exploit it

Examples of overlays
□ DNS
□ BGP routers and their peering relationships
□ Content distribution networks (CDNs)
□ Application-level multicast
○ economical way around barriers to IP multicast
□ And P2P apps!

Overview of P2P
□ overlay networks
□ current P2P applications
○ P2P file sharing & copyright issues
○ Instant messaging / voice over IP
○ P2P distributed computing
□ worldwide computer vision

P2P file sharing
□ Alice runs P2P client application on her notebook computer
□ Intermittently connects to Internet; gets new IP address for each connection
□ Registers her content in P2P system
□ Asks for “Hey Jude”
□ Application displays other peers that have a copy of Hey Jude.
□ Alice chooses one of the peers, Bob.
□ File is copied from Bob’s PC to Alice’s notebook: P2P
□ While Alice downloads, other users upload from Alice.

Millions of content servers
Killer deployments
□ Napster
○ disruptive; proof of concept
□ Gnutella
○ open source
□ KaZaA/FastTrack
○ Today more KaZaA traffic than Web traffic!
□ eDonkey / Overnet
○ Becoming popular in Europe
○ Appears to use a DHT
Is success due to massive number of servers, or simply because content is free?

P2P file sharing software
□ Allows Alice to open up a directory in her file system
○ Anyone can retrieve a file from directory
○ Like a Web server
□ Allows Alice to copy files from other users’ open directories:
○ Like a Web client
□ Allows users to search nodes for content based on keyword matches:
○ Like Google
Seems harmless to me!

Copyright issues (1)
Direct infringement:
□ end users who download or upload copyrighted works
Indirect infringement:
□ Hold an individual accountable for actions of others
□ Contributory
□ Vicarious

Copyright issues (2)
Contributory infringer:
□ knew of underlying direct infringement, and
□ caused, induced, or materially contributed to direct infringement
Vicarious infringer:
□ able to control the direct infringers (e.g., terminate user accounts), and
□ derived direct financial benefit from direct infringement (money, more users)
(knowledge not necessary)

Copyright issues (3)
Betamax VCR defense
□ Manufacturer not liable for contributory infringement
□ “capable of substantial non-infringing use”
□ But in Napster case, court found defense does not apply to all vicarious liability

Guidelines for P2P developers
□ total control so that there’s no direct infringement, or
□ no control over users – no remote kill switch, no automatic updates, actively promote non-infringing uses of product
□ Disaggregate functions: indexing, search, transfer
□ No customer support

Instant Messaging
□ Alice runs IM client on her PC
□ Intermittently connects to Internet; gets new IP address for each connection
□ Registers herself with “system”
□ Learns from “system” that Bob in her buddy list is active
□ Alice initiates direct TCP connection with Bob: P2P
□ Alice and Bob chat.
□ Can also be voice, video and text.
We’ll see that Skype is a VoIP P2P system

P2P Distributed Computing
seti@home
□ Search for ET intelligence
□ Central site collects radio telescope data
□ Data is divided into work chunks of 300 Kbytes
□ User obtains client, which runs in background
□ Peer sets up TCP connection to central computer, downloads chunk
□ Peer does FFT on chunk, uploads results, gets new chunk
Not peer to peer, but exploits resources at network edge

Worldwide Computer Vision
□ overlay networks
□ P2P applications
□ worldwide computer vision
Worldwide Computer Vision
Alice’s home computer:
□ Working for biotech, matching gene sequences
□ DSL connection downloading telescope data
□ Contains encrypted fragments of thousands of non-Alice files
□ Occasionally a fragment is read; it’s part of a movie someone is watching in Paris
□ Her laptop is off, but it’s backing up others’ files
□ Alice’s computer is moonlighting
□ Payments come from biotech company, movie system and backup service
Your PC is only a component in the “big” computer

Worldwide Computer (2)
Anderson & Kubiatowicz: Internet-scale OS
□ Thin software layer running on each host & central coordinating system running on ISOS server complex
□ allocating resources, coordinating currency transfer
□ Supports data processing & online services
Challenges
□ heterogeneous hosts
□ security
□ payments
Central server complex
□ needed to ensure privacy of sensitive data
□ ISOS server complex maintains databases of resource descriptions, usage policies, and task descriptions

2. Unstructured P2P File Sharing
□ Napster
□ Gnutella
□ KaZaA
□ BitTorrent
□ search theory
□ dealing with flash crowds

Napster
□ Paradigm shift
□ not the first (cf. probably Eternity, from Ross Anderson in Cambridge)
□ but instructive for what it gets right, and also wrong…
□ also had a political message…and economic and legal…

Napster
□ program for sharing files over the Internet
□ a “disruptive” application/technology?
□ history:
○ 5/99: Shawn Fanning (freshman, Northeastern U.) founds Napster Online music service
○ 12/99: first lawsuit
○ 3/00: 25% of UWisc traffic is Napster
○ 2/01: US Circuit Court of Appeals: Napster knew users violating copyright laws
○ 7/01: # simultaneous online users: Napster 160K, Gnutella: 40K, Morpheus (KaZaA): 300K

Napster
□ judge orders Napster to pull plug in July ’01
□ other file sharing apps take over!

Napster: how did it work
□ Application-level, client-server protocol over point-to-point TCP
□ Centralized directory server
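A minimal sketch of the index such a centralized directory server might keep; class and method names here are illustrative, not Napster's actual protocol:

```python
import re
from collections import defaultdict

# Napster-style centralized directory: peers upload their file lists,
# and keyword searches run against the server's in-memory index.
class Directory:
    def __init__(self):
        self.index = defaultdict(set)   # keyword -> {(peer_ip, filename)}

    def register(self, peer_ip, filenames):
        for name in filenames:
            for word in re.findall(r"[a-z0-9]+", name.lower()):
                self.index[word].add((peer_ip, name))

    def search(self, keyword):
        return sorted(self.index[keyword.lower()])

d = Directory()
d.register("10.0.0.1", ["hey jude.mp3"])
d.register("10.0.0.2", ["hey jude.mp3", "let it be.mp3"])
assert {ip for ip, _ in d.search("jude")} == {"10.0.0.1", "10.0.0.2"}
```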
Steps:
□ connect to Napster server
□ upload your list of files to server.
□ give server keywords to search the full list with.
□ select “best” of correct answers (pings)

Napster
1. File list and IP address is uploaded to napster.com centralized directory
2. User requests search at server.
3. User pings hosts that apparently have data. Looks for best transfer rate.
4. User chooses server.
Napster’s centralized server farm had a difficult time keeping up with traffic

2. Unstructured P2P File Sharing
□ Napster
□ Gnutella
□ KaZaA
□ BitTorrent
□ search theory
□ dealing with flash crowds

Distributed Search/Flooding

Gnutella
□ Gnutella history:
○ 3/14/00: release by AOL, almost immediately
withdrawn
○ became open source
○ many iterations to fix poor initial design (poor
design turned many people off)
□ issues:
○ how much traffic does one query generate?
○ how many hosts can it support at once?
○ what is the latency associated with querying?
○ is there a bottleneck?

Gnutella: limited scope query
Searching by flooding:
□ if you don’t have the file you want, query 7 of your neighbors.
□ if they don’t have it, they contact 7 of their neighbors, for a maximum hop count of 10.
□ reverse path forwarding for responses (not files)
Note: Play gnutella animation at: http://www.limewire.com/index.jsp/p2p

Gnutella overlay management
□ New node uses bootstrap node to get IP
addresses of existing Gnutella nodes
□ New node establishes neighboring relations by
sending join messages

Gnutella in practice
□ Gnutella traffic << KaZaA traffic
□ 16-year-old daughter said “it stinks”
○ Couldn’t find anything
○ Downloads wouldn’t complete
□ Fixes: do things KaZaA is doing: hierarchy, queue management, parallel download, …

Gnutella Discussion:
□ researchers like it because it’s open source
○ but is it truly representative?
□ architectural lessons learned?
□ More details in Kurose and Ross, 3rd edition

2. Unstructured P2P File Sharing
□ Napster
□ Gnutella
□ KaZaA
□ BitTorrent
□ search theory
□ dealing with flash crowds

KaZaA: The service
□ more than 3 million up peers sharing over
3,000 terabytes of content
□ more popular than Napster ever was
□ more than 50% of Internet traffic ?
□ MP3s & entire albums, videos, games
□ optional parallel downloading of files
□ automatically switches to new download
server when current server becomes
unavailable
□ provides estimated download times

KaZaA: The service (2)
□ User can configure max number of simultaneous
uploads and max number of simultaneous downloads
□ queue management at server and client
○ Frequent uploaders can get priority in server queue
□ Keyword search
○ User can configure “up to x” responses to keywords
□ Responses to keyword queries come in waves;
stops when x responses are found
□ From user’s perspective, service resembles Google, but provides links to MP3s and videos rather than Web pages

KaZaA: Technology
Software
□ Proprietary
□ control data encrypted
□ Everything in HTTP request and response
messages
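Because transfers ride on ordinary HTTP, a client can fetch different byte ranges of one file from different peers; a sketch of carving a file into HTTP-style inclusive ranges (the helper name is mine):

```python
# Split [0, file_size) into contiguous byte ranges, one per source peer,
# in the inclusive start-end form that HTTP Range headers use.
def byte_ranges(file_size, num_sources):
    base, extra = divmod(file_size, num_sources)
    ranges, start = [], 0
    for i in range(num_sources):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges

# Range header the second of three peers would be asked for, for a
# 1,000,000-byte file:
ranges = byte_ranges(1_000_000, 3)
header = f"Range: bytes={ranges[1][0]}-{ranges[1][1]}"
```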
Architecture
□ hierarchical
□ cross between Napster and Gnutella

KaZaA: Architecture
□ Each peer is either a supernode or is assigned to a supernode
○ 56 min avg connect
○ Each SN has about
100-150 children
○ Roughly 30,000 SNs
□ Each supernode has
TCP connections with
30-50 supernodes
○ 0.1% connectivity
○ 23 min avg connect

Measurement study
○ Tap point SN–SN session
○ Tap point ON–SN session

Evolution of connections at SN

KaZaA: Architecture (2)
□ Nodes that have more connection bandwidth and are more available are designated as supernodes
□ Each supernode acts as a mini-Napster hub, tracking the content and IP addresses of
its descendants
□ Does a KaZaA SN track only the content of its children, or does it also track the content under its neighboring SNs?
○ Testing indicates only children.

KaZaA metadata
□ When ON connects to SN, it uploads its metadata.
□ For each file:
○ File name
○ File size
○ Content Hash
○ File descriptors: used for keyword matches during query
□ Content Hash:
○ When peer A selects file at peer B, peer A sends
ContentHash in HTTP request
○ If download for a specific file fails (partially completes),
ContentHash is used to search for new copy of file.

KaZaA: Overlay maintenance
□ List of potential supernodes included within
software download
□ New peer goes through list until it finds
operational supernode
○ Connects, obtains more up-to-date list, with
200 entries
○ Nodes in list are “close” to ON.
○ Node then pings 5 nodes on list and connects with one
□ If supernode goes down, node obtains updated list and chooses new supernode

KaZaA Queries
□ Node first sends query to supernode
○ Supernode responds with matches
○ If x matches found, done.
□ Otherwise, supernode forwards query to
subset of supernodes
○ If total of x matches found, done.
□ Otherwise, query further forwarded
○ Probably by original supernode rather than recursively

Kazaa-lite
□ Hacked version of the KaZaA client
□ No spyware; no pop-up windows
□ Everyone is rated as a priority user
□ Supernode hopping
○ After receiving replies from SN, ON often
connects to new SN and re-sends query
○ SN does not cache hopped-out ON’s metadata

Parallel Downloading; Recovery
□ If file is found in multiple nodes, user can
select parallel downloading
○ Identical copies identified by ContentHash
□ HTTP byte-range header used to request
different portions of the file from different
nodes
□ Automatic recovery when server peer stops
sending file
○ ContentHash

KaZaA Corporate Structure
□ Software developed by Estonians
□ FastTrack originally
incorporated in Amsterdam
□ FastTrack also deploys
KaZaA service
□ FastTrack licenses
software to Music City
(Morpheus) and Grokster
□ Later, FastTrack
terminates license, leaves
only KaZaA with killer
service
□ Summer 2001, Sharman Networks, founded in Vanuatu (small island in Pacific), acquires FastTrack
□ Board of directors, investors: secret
□ Employees spread around, hard to locate

Lessons learned from KaZaA
KaZaA provides powerful file search and transfer service without server infrastructure
Exploit heterogeneity
Provide automatic recovery for interrupted downloads
Powerful, intuitive user interface

Measurement studies by Gribble et al., 2002: U. Wash campus study
□ P2P: 43%; Web: 14%
□ Kazaa objects fetched at most once per client
□ Popularity distribution deviates substantially from Zipf distribution
○ Flat for 100 most popular objects
○ Popularity of objects is short-lived
KaZaA users are patient
□ Small objects (<10MB): 30% take more than an hour to download
□ Large objects (>100MB): 50% take more than 1 day
□ Kazaa is a batch-mode system, downloads done in background

Pollution in P2P
□ Record labels hire “polluting companies” to
put bogus versions of popular songs in file
sharing systems
□ Polluting company maintains hundreds of nodes with high bandwidth connections
□ User A downloads polluted file
□ User B may download polluted file before A removes it
□ How extensive is pollution today?
□ Anti-pollution mechanisms?
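One plausible anti-pollution mechanism, in the spirit of KaZaA's ContentHash, is to verify a completed download against the hash advertised in the index; KaZaA's actual hash algorithm is proprietary, so SHA-1 below is only a stand-in:

```python
import hashlib

def verify_download(data: bytes, expected_hex: str) -> bool:
    # Reject a download whose bytes don't hash to the advertised value.
    # SHA-1 is a stand-in for KaZaA's proprietary ContentHash.
    return hashlib.sha1(data).hexdigest() == expected_hex

advertised = hashlib.sha1(b"genuine song bytes").hexdigest()
assert verify_download(b"genuine song bytes", advertised)
assert not verify_download(b"bogus polluted version", advertised)
```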
2. Unstructured P2P File Sharing
□ Napster
□ Gnutella
□ KaZaA
□ BitTorrent
□ search theory
□ dealing with flash crowds

BitTorrent

BitTorrent: Pieces
□ File is broken into pieces
○ Typically piece is 256 KBytes
○ Upload pieces while downloading pieces
□ Piece selection
○ Select rarest piece
○ Except at beginning, select random pieces
□ Tit-for-tat
○ BitTorrent uploads to at most four peers
○ Among the uploaders, upload to the four that are downloading from you at the highest rates
○ A little randomness too, for probing

NATs
□ nemesis for P2P
□ Peer behind NAT can’t be a TCP server
□ Partial solution: reverse call
○ Suppose A wants to download from B, B behind NAT
○ Suppose A and B each maintain a TCP connection to server C (not behind NAT)
○ A can then ask B, through C, to set up a TCP connection from B to A.
○ A can then send query over this TCP connection, and B can return the file
□ What if both A and B are behind NATs?

2. Unstructured P2P File Sharing
□ Napster
□ Gnutella
□ KaZaA
□ search theory
□ dealing with flash crowds

Modeling Unstructured P2P Networks
□ In comparison to DHT-based searches, unstructured searches are
○ simple to build
○ simple to understand algorithmically
□ Little concrete is known about their performance
□ Q: what is the expected overhead of a search?
□ Q: how does caching pointers help?

Replication
□ Scenario
○ Nodes cache copies of (or pointers to) content
• object info can be “pushed” from nodes that have copies
• more copies leads to shorter searches
○ Caches have limited size: can’t hold everything
○ Objects have different popularities: different content requested at different rates
□ Q: How should the cache be shared among the different content?
○ Favor items under heavy demand too much, then lightly demanded items will drive up search costs
○ Favor a more “flat” caching (i.e., independent of popularity), then frequent searches for heavily-requested items will drive up costs
□ Is there an optimal strategy?

Model
□ Given
○ m objects, n nodes, each node can hold c objects, total system capacity = cn
○ qi is the request rate for the ith object, q1 ≥ q2 ≥ … ≥ qm
○ pi is the fraction of total system capacity used to store object i, ∑pi = 1
□ Then
○ Expected length of search for object i = K / pi for some constant K (search stops as soon as object is found)
○ Network “bandwidth” used to search for all objects: B = ∑qi K / pi
□ Goal: Find allocation for {pi} (as a function of {qi}) to minimize B
□ Goal 2: Find distributed method to implement this allocation of {pi}

Some possible choices for {pi}
□ Consider some typical allocations used in practice
○ Uniform: p1 = p2 = … = pm = 1/m
• easy to implement: whoever creates the object sends out cn/m copies
○ Proportional: pi = a qi, where a = 1/∑qi is a normalization constant
• also easy to implement: keep the received copy cached
□ What is B = ∑qi K / pi for these two policies?
○ Uniform: B = ∑qi K / (1/m) = Km/a
○ Proportional: B = ∑qi K / (a qi) = Km/a
□ B is the same for the Proportional and Uniform policies!

In between Proportional and Uniform
□ Uniform: pi / pi+1 = 1; Proportional: pi / pi+1 = qi / qi+1 ≥ 1
□ In between: 1 ≤ pi / pi+1 ≤ qi / qi+1
□ Claim: any in-between allocation has lower B than B for Uniform / Proportional
□ Proof: Omitted here
□ Consider Square-Root allocation: pi = sqrt(qi) / ∑sqrt(qi)
□ Thm: Square-Root is optimal
□ Proof (sketch):
○ Noting pm = 1 – (p1 + … + pm-1), write B = F(p1, …, pm-1) = ∑i<m qi/pi + qm/(1 – ∑i<m pi)
○ Solving dF/dpi = 0 gives pi = pm sqrt(qi/qm)

Distributed Method for Square-Root Allocation
□ Assumption: each copy in the cache disappears from the cache at some rate independent of the object cached (e.g., object lifetime is i.i.d.)
□ Algorithm Sqrt-Cache: cache a copy of object i (once found) at each node visited while searching for object i
□ Claim Alg