Architecture of Distributed Systems
202401181241
Status: #idea
Tags: DC
Layered Architecture Style
- Each layer provides a specific service
- There should be NO dependency between the layers (with the exception of the interface)
- Very common in Distributed Information Systems
- TODO image from slide 4
- Hybrid layering
- Allows you to skip a layer for performance reasons
- An upcall is similar to an interrupt: the handler plays the role of the ISR, and making the upcall corresponds to raising the interrupt.
- OS defines placeholders called handlers, which it calls
- The OS doesn't know who it calls.
- A layer does not know anything about the implementation of the layer it uses, so that implementation can change
- Eg. Sockets in NetProg. How TCP is implemented is hidden. All you know is that you chose to use TCP on a particular port
- Eg. Search engine (TODO pic from slide 6)
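The upcall idea above (a lower layer calling placeholder handlers without knowing who installed them) can be sketched as follows. This is a minimal illustration, not how any real OS registers handlers; `LowerLayer`, `register_handler`, and `raise_event` are hypothetical names.

```python
from typing import Callable, Dict, List


class LowerLayer:
    """Lower layer (e.g. an OS) that exposes placeholder handlers."""

    def __init__(self) -> None:
        # Event name -> handler. The layer never inspects who registered it.
        self._handlers: Dict[str, Callable[[str], None]] = {}

    def register_handler(self, event: str, handler: Callable[[str], None]) -> None:
        """Called by the upper layer to fill in a placeholder."""
        self._handlers[event] = handler

    def raise_event(self, event: str, payload: str) -> bool:
        """Upcall: invoke whatever handler is installed, if any."""
        handler = self._handlers.get(event)
        if handler is None:
            return False
        handler(payload)
        return True


# The upper layer installs a handler; the lower layer later calls it.
received: List[str] = []
os_layer = LowerLayer()
os_layer.register_handler("packet_arrived", received.append)
os_layer.raise_event("packet_arrived", "hello")
```

The lower layer depends only on the handler's call signature, which is the interface between the two layers.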
Service Oriented Architecture
Object-based
- TODO image from slide 7
- Performs 2 compilations
- Normal
- Second compiler
- Creates code for the proxy on the client. The proxy has the same methods as the remote object, but their bodies only contain network communication code.
- Creates a skeleton for the server code
- Eg. RMI compiler in Java
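The proxy/skeleton split can be sketched in a few lines. This is only an illustration: Java RMI generates these classes, whereas the sketch below uses dynamic dispatch, JSON as the wire format, and a direct function call in place of the network. `Calculator`, `Proxy`, and `Skeleton` are hypothetical names.

```python
import json
from typing import Any, Callable


class Calculator:
    """The remote object living on the server (hypothetical example)."""

    def add(self, a: int, b: int) -> int:
        return a + b


class Skeleton:
    """Server side: unmarshals a request and invokes the real object."""

    def __init__(self, target: Any) -> None:
        self._target = target

    def handle(self, request: bytes) -> bytes:
        message = json.loads(request.decode("utf-8"))
        method = getattr(self._target, message["method"])
        result = method(*message["args"])
        return json.dumps({"result": result}).encode("utf-8")


class Proxy:
    """Client side: offers the remote object's methods, but only marshals calls."""

    def __init__(self, transport: Callable[[bytes], bytes]) -> None:
        self._transport = transport  # stand-in for a network connection

    def __getattr__(self, name: str) -> Callable[..., Any]:
        def remote_call(*args: Any) -> Any:
            request = json.dumps({"method": name, "args": list(args)}).encode("utf-8")
            reply = self._transport(request)
            return json.loads(reply.decode("utf-8"))["result"]

        return remote_call


skeleton = Skeleton(Calculator())
calculator = Proxy(skeleton.handle)  # a real system would send the bytes over a socket
total = calculator.add(2, 3)
```

The client calls `calculator.add` as if the object were local; only the proxy and the skeleton know that bytes are exchanged.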
Resource-based
- Used in Azure
- REST architecture
- Trivia: Created by a university student as a part of PhD thesis
- All services offer the same/similar interface
- Server is stateless. It does not remember previous requests. Helps in scaling
- Operations are similar to CRUD (in DBMS)
- PUT - Create
- GET - Read
- POST - Update
- DELETE - Delete
- Eg. Amazon S3
- The objects are files
- Buckets are similar to directories, but a bucket can only contain objects. No bucket in a bucket
- Not using REST APIs allows you to use parameters, where you can have compile-time checks to catch errors
- With non-REST APIs, the function of each call is easier to understand.
- Published interface -> More stable and generic interface, which does not change. So server can change a lot of things, and overall structure remains the same
- Neither is strictly better
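The uniform, stateless interface described above can be sketched with an in-memory store. The verb mapping follows these notes (PUT creates, POST updates); many real APIs assign the verbs differently. `ObjectStore` and the status codes chosen here are assumptions for illustration and are not the Amazon S3 API.

```python
from typing import Dict, Optional, Tuple

Response = Tuple[int, Optional[bytes]]


class ObjectStore:
    """In-memory stand-in for a bucket/object store (not the real S3 API)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def handle(self, method: str, uri: str, body: Optional[bytes] = None) -> Response:
        """Every request is self-contained: no per-client session is kept."""
        exists = uri in self._objects
        if method == "PUT":  # create
            if exists:
                return 409, None
            self._objects[uri] = body or b""
            return 201, None
        if method == "GET":  # read
            return (200, self._objects[uri]) if exists else (404, None)
        if method == "POST":  # update
            if not exists:
                return 404, None
            self._objects[uri] = body or b""
            return 200, None
        if method == "DELETE":  # delete
            if not exists:
                return 404, None
            del self._objects[uri]
            return 204, None
        return 405, None  # any other verb is not part of the interface
```

Because the same four operations apply to every resource, a client only needs the URI to work with a new kind of object.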
Microservice Architecture
- Develop an application as a suite of small components
- Implements a business capability and is independently deployable
- Independently replaceable and upgradable, hence loosely coupled
- Published interface (does not change)
- Each microservice can be replicated a different number of times.
Design Principle
- Create a service that is based on business functionality
- Smart endpoint, dumb pipe
- Put all your intelligence in the module itself.
- Do not put intelligence in the pipe (where the data flows)
- Do not assume that there is any intelligence in the pipe. Keep the communication as simple as possible
Design of Data Persistence
- Multiple types of data store (not necessarily relational)
- TODO Slide 13 of arch
Publish-Subscribe Style
- System as a collection of processes
- Event based - Events are produced and subscribed to
| | Temporally Coupled | Temporally Decoupled |
|---|---|---|
| Referentially Coupled | Direct (Phone call) | Mailbox (E-Mail) |
| Referentially Decoupled | Event-based | Shared data space |
Info
Temporally coupled processes need to be running at the same time
Note
Kafka is a light-weight and extremely scalable event-based data streaming tool
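The event-based cell of the table (referentially decoupled, temporally coupled) can be sketched with a tiny in-process event bus. This is an illustration only and does not reflect how Kafka works; `EventBus` and the topic name are hypothetical.

```python
from typing import Callable, Dict, List


class EventBus:
    """Publishers and subscribers only share topic names, never references."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = {}

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic: str, event: str) -> int:
        """Deliver to current subscribers only; return how many received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(event)
        return len(callbacks)


bus = EventBus()
early: List[str] = []
late: List[str] = []
bus.subscribe("orders", early.append)
delivered = bus.publish("orders", "order-1")
bus.subscribe("orders", late.append)  # subscribed too late to see order-1
```

The late subscriber misses the event because nothing is stored, which is the temporal coupling; storing events for later retrieval would move the design towards a shared data space.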
Example: Linda Tuple Space
- 3 simple operations
- in(t): Remove a tuple matching template t
- rd(t): Obtain a copy of a tuple matching template t
- out(t): Add tuple t to the tuple space. Calling it twice will create 2 copies
- in and rd are blocking operations. The caller blocks until a matching tuple is found or becomes available
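The three operations can be sketched as a small thread-safe class. This is a minimal sketch under stated assumptions, not Linda itself: templates use `None` as a wildcard, and `in` is spelled `in_` because `in` is a Python keyword.

```python
import threading
from typing import List, Optional


class TupleSpace:
    """Shared data space with blocking in_/rd and non-blocking out."""

    def __init__(self) -> None:
        self._tuples: List[tuple] = []
        self._cond = threading.Condition()

    @staticmethod
    def _matches(template: tuple, candidate: tuple) -> bool:
        # None in the template matches any value at that position.
        return len(template) == len(candidate) and all(
            t is None or t == c for t, c in zip(template, candidate)
        )

    def _find(self, template: tuple) -> Optional[tuple]:
        for candidate in self._tuples:
            if self._matches(template, candidate):
                return candidate
        return None

    def out(self, t: tuple) -> None:
        """Add a tuple; adding the same tuple twice stores two copies."""
        with self._cond:
            self._tuples.append(t)
            self._cond.notify_all()

    def rd(self, template: tuple) -> tuple:
        """Return a matching tuple without removing it; block until one exists."""
        with self._cond:
            found = self._find(template)
            while found is None:
                self._cond.wait()
                found = self._find(template)
            return found

    def in_(self, template: tuple) -> tuple:
        """Remove and return a matching tuple; block until one exists."""
        with self._cond:
            found = self._find(template)
            while found is None:
                self._cond.wait()
                found = self._find(template)
            self._tuples.remove(found)
            return found

    def __len__(self) -> int:
        with self._cond:
            return len(self._tuples)
```

A producer calling `out(("temp", 21))` and a consumer calling `in_(("temp", None))` never need to know about each other or run at the same time.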
Middleware
- A software layer, on top of the OS of distributed nodes to assist distributed applications
- Provides
- Inter-app communication
- Security
- Accounting
- Failure masking (recovery)
- Design patterns
- Wrapper: Component that offers an interface to the client and solves the problem of incompatible interfaces
- Amazon S3: Storage is accessed through the RESTful API or the boto library
- 1-1 communication
- Legacy software uses this
- With N applications: O(N^2) adapters
- Brokers
- With N applications: O(N) adapters
- Sophisticated brokers are called application servers
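The difference in adapter count between 1-1 wrappers and a broker can be worked out with simple arithmetic. The sketch assumes one adapter per ordered pair of applications in the 1-1 case and two adapters per application (to and from the broker) in the broker case; both counts and the function names are assumptions for illustration.

```python
def pairwise_adapters(applications: int) -> int:
    """Adapters needed when every application wraps every other one directly."""
    return applications * (applications - 1)


def broker_adapters(applications: int) -> int:
    """Adapters needed when each application only talks to a central broker."""
    return 2 * applications


for count in (3, 10, 100):
    print(count, pairwise_adapters(count), broker_adapters(count))
```

The quadratic growth of the first count is why a broker becomes attractive once more than a handful of applications need to interoperate.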
Interceptors
- TODO: Image of Slide 19
Example: Network File System (NFS)
- Each NFS server provides a standardised view of its local file system
- TODO Image from slide 4
Example: Web Server
TODO Image from slide 5
- Common Gateway Interface (CGI)
- Can also have server side scripting
<strong><?php echo $_SERVER['REMOTE_ADDR']; ?></strong>
Different Types of Distribution
- Vertical distribution
- Layers are distributed in multiple machines
- Client server
- Horizontal distribution
- Both client and servers are physically split and distributed in different machines
- Peer to peer
- All processes are equal
- Each process will act as a client and server
Attention
Each machine need NOT be running identical functions
Peer to Peer
Structured
- Nodes adhere to a particular topology (ring, grid, tree)
- Efficient data lookup
Hypercube
- Distributed hashing example
- Each P2P node stores <K, V> pairs
- Key is the index used to route to the node with the data
- TODO: Image of n-dimensional cube topology (Slide 7)
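Routing on a hypercube can be sketched by treating node IDs as bit strings and correcting one differing bit per hop. The sketch assumes every one of the 2^n nodes exists and that keys are mapped to node IDs with SHA-1; `key_to_node` and `hypercube_route` are hypothetical helper names.

```python
import hashlib
from typing import List


def key_to_node(key: str, dimensions: int) -> int:
    """Map a key to a node ID in [0, 2**dimensions) (assumed hashing scheme)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** dimensions)


def hypercube_route(source: int, target: int, dimensions: int) -> List[int]:
    """Return the node IDs visited when fixing one differing bit per hop."""
    size = 2 ** dimensions
    if not (0 <= source < size and 0 <= target < size):
        raise ValueError("node IDs must fit in the given number of dimensions")
    path = [source]
    current = source
    for bit in range(dimensions):
        mask = 1 << bit
        if (current ^ target) & mask:
            current ^= mask  # neighbours differ in exactly one bit
            path.append(current)
    return path


route = hypercube_route(0b0000, 0b1011, 4)  # visits 0000, 0001, 0011, 1011
```

The number of hops equals the number of differing bits, so a lookup never takes more than n hops in an n-dimensional cube.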
Chord/Ring
- Each node has an m-bit ID
- Only nodes in bold actually exist
- A data item with key k is stored at the existing node with the smallest ID >= k
- Each directed edge is a shortcut.
- They are created such that the shortest distance between any two nodes is O(log N)
- Go to the node which is preceding 3, and closest to 3. It can only pick its neighbour, hence 1 is not picked.
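- Join: insert the new node before its successor
- Links that pointed to the successor now go to the new node
- Copy the data whose keys now belong to the new node from the successor to the new node
- Leave: the departing node informs its predecessor and successor of its departure
- Copy its data to its successor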
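A Chord-style lookup with shortcuts can be sketched as below. This is a simplified model, not a full Chord implementation: finger tables are computed from a global list of nodes instead of being maintained by each node, and the node set in the usage lines is an assumed example, not the one from the slides.

```python
from bisect import bisect_left
from typing import List


def successor(node_ids: List[int], key: int, m: int) -> int:
    """Node responsible for key: smallest existing ID >= key, wrapping around."""
    ring = sorted(node_ids)
    index = bisect_left(ring, key % (2 ** m))
    return ring[index % len(ring)]


def finger_table(node_ids: List[int], node: int, m: int) -> List[int]:
    """Shortcut i points to the successor of node + 2**i."""
    return [successor(node_ids, (node + 2 ** i) % (2 ** m), m) for i in range(m)]


def lookup(node_ids: List[int], start: int, key: int, m: int) -> List[int]:
    """Return the nodes visited while resolving key, starting at node start."""
    size = 2 ** m
    key %= size
    path = [start]
    current = start
    while True:
        fingers = finger_table(node_ids, current, m)
        succ = fingers[0]
        to_key = (key - current) % size  # clockwise distance to the key
        if to_key == 0 or succ == current:
            return path  # current node owns the key (or is the only node)
        if to_key <= (succ - current) % size:
            path.append(succ)  # key lies between current and its successor
            return path
        # Otherwise jump to the farthest shortcut that still precedes the key.
        preceding = [f for f in fingers if 0 < (f - current) % size < to_key]
        current = max(preceding, key=lambda f: (f - current) % size)
        path.append(current)


nodes = [1, 4, 9, 11, 14, 18, 20, 21, 28]  # assumed example ring with m = 5
hops = lookup(nodes, start=1, key=26, m=5)
```

Each hop strictly reduces the clockwise distance to the key, which is what keeps lookups short on a large ring.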
Unstructured
- Random graph
- Each node maintains a random list of neighbours; an edge to any given node exists only with a certain probability
- Searching
- Flooding
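- The requesting node sends a request for the data item to all its neighbours
- If a neighbour has already seen the request, it ignores it
- If the neighbour has the data item, it sends it to the node that requested the data from it (need not be the original requester)
- Else, the neighbour forwards the request to its own neighbours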
- Flood to $d$ randomly chosen neighbours
- After $k$ steps, $R(k) = d(d-1)^{k-1}$ nodes will have been reached
- With $r/N$ the fraction of nodes having the data
- We will have found the data if $\frac{r}{N} \cdot R(k) \geq 1$
- Random walk
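- The requesting node passes the request for the data item to a randomly chosen neighbour
- If that neighbour does not have the data, it forwards the request to one of its own randomly chosen neighbours
- Probability that data is found after $k$ attempts, with $r$ of the $N$ nodes holding the data: $P[k] = \frac{r}{N}\left(1 - \frac{r}{N}\right)^{k-1}$
- Expected number of nodes to be probed before finding the data: $S = \sum_{k \geq 1} k \cdot P[k] \approx N/r$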
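The two search strategies can be compared numerically. The sketch below uses the usual textbook model, which is an assumption here since the notes lost their symbols: flooding contacts `d` random neighbours per node and reaches about `d * (d - 1) ** (k - 1)` nodes after `k` steps, and `r` out of `n` nodes hold the data. All function names are hypothetical.

```python
def nodes_reached(d: int, k: int) -> int:
    """Approximate number of nodes reached after k flooding steps."""
    return d * (d - 1) ** (k - 1)


def flooding_finds_data(d: int, k: int, r: int, n: int) -> bool:
    """Data is expected to be found once (r / n) * nodes_reached >= 1."""
    return (r / n) * nodes_reached(d, k) >= 1


def random_walk_hit_probability(k: int, r: int, n: int) -> float:
    """Probability that the k-th probed node is the first one holding the data."""
    p = r / n
    return p * (1 - p) ** (k - 1)


def expected_probes(r: int, n: int, max_steps: int = 10_000) -> float:
    """Expected number of probes, summed numerically; approaches n / r."""
    return sum(
        k * random_walk_hit_probability(k, r, n) for k in range(1, max_steps + 1)
    )


# With 1% of 10,000 nodes holding the data, a random walk probes about 100 nodes.
probes = expected_probes(r=100, n=10_000)
```

Flooding reaches the data in few steps but contacts many nodes, while a random walk contacts few nodes per step but may take many steps.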
Example: BitTorrent
- TODO: Image from slide 11
- Uses the tracker model
- In an alternative implementation of BitTorrent, a node also joins a DHT to assist in tracking file downloads
- Each node acts as a tracker for a small set of files
- New peer will communicate only with those peers and not with the initial tracker
- Initial tracker for the requested file is looked up in the DHT through a magnet link
- DHT indexes the torrent file (and not the pieces of the file)
Hybrid Architecture: Edge-Cloud
- Computation happens close to the cyber-physical system
- Industry automation
- Each factory - Edge
- Cloud is used to connect multiple factories
References
- Published Interface - Article by Martin Fowler in IEEE Software
- Network File System
- Trackerless Torrents
- How does DHT get bootstrapped?