Google system design interview: Design Spotify (with ex-Google EM)

[Music] hello there and welcome to this system design mock interview today we want to show you a
really high quality best-in-class system design answer so that you can use it to prepare for
your interviews and use it to know exactly what standard you should be aiming for with us to
give the answers as the candidate today I have a former engineering manager at Google it's Mark
Mark how are you doing. I'm doing great Tom thanks for having me I'm looking forward to doing this
oh thanks for thanks for coming on it's great to have you before we get started
just want to quickly give people an idea of your background as a as an engineering manager
yeah I was an engineering manager at Google for 13 years and worked on several large-scale systems
with lots of great engineers and so I I hope I've learned a little bit from them and it'll
be interesting to see me on the other side of the fence of uh of uh doing the being the interview
candidate here so I'm looking forward to it I'm sure you do a great job yeah and Mark also is a
coach on our platform so if you need some expert feedback ahead of your interview at Google Facebook
anywhere any tech company then do check him out on our platform okay uh if you're ready Mark
let's crack straight into it the question I want to ask you today is how would you design Spotify
okay
uh so yeah so uh Spotify the probably one of the most popular music streaming services
out there I guess uh so let's let me let me think about that just for a moment [Music]
um because there's lots there's a lot of aspects of Spotify here let me see if I can actually maybe
I'll just take some notes here so that I can kind of I think I think through this if that's okay and
and can I get some uh make some assumptions so I have my notes I'll just put this right in here
and then if it's okay I'll just use this for notes as well as sort of the design diagrams
and things like that later on yeah sounds good so yeah so let's yeah so let's talk about just
some different sort of Spotify okay so I for Spotify I can I can think of things I mean
obviously there's songs music that's the main thing uh there's you know you have playlists
I mean there's users there's uh artists uh I guess Spotify even has things like podcasts
that you know you can listen to and things like that so there's a lot to it and so in terms of the
system design for today uh is there is there like can we can we constrain this a little bit just
so that we can actually get through a design what would you what would we would be good to focus on
here yeah let's constrain it let's limit ourselves to let's look at finding and playing music
okay okay so let me uh so use cases finding uh and playing music I'm just gonna write it down
like verbatim here great yeah I think enough to talk about I think so yeah okay good so uh
if I if I think about that let me let me try and ask a little bit or think a little bit about
the use cases a little bit more so my typical use case then as a user if I'm signed up to Spotify
and I have an app on a phone uh I am going to browse some music maybe I'll find a particular
search for a particular song but but then the most core thing is that I play a song and it's
you know coming back through my phone maybe my car or whatever wherever I might be headphones uh
to listen to it with big broad questions like this you need to constrain the problem make it solvable
in the space of an hour and Mark did this well establishing the use cases we wanted to focus on
if you're asked to design an already existing system in this case Spotify share what you
know about it with the interviewer they can correct you or fill you in if you have any
knowledge gaps so if I think about other if I think about metrics here let's think
about size let's let's do some quick quick drilling down into into some numbers here
and pardon me while I get this going here so in numbers so tell me about how many users
you're thinking about here for for this design yeah let's say a billion users
okay one billion users okay what about number of songs yeah I think I saw I think I saw it
somewhere that Spotify has about 80 million songs but let's say we need to have capacity
for 100 million on there okay all right so 100 million songs a billion users and you know we're
we're going to focus on the song piece of it so but let me do a little bit of uh of math here
or a little bit of thinking and you can I can I'll just you know double check with you to
make sure that this makes sense but for from what I know think about a typical mp3 audio is going
to be about for a typical song is is about five megabytes and I mean I think that depends on oh
the encoding of the song how long the song is and things like that but uh if that if that if that's
okay we can just kind of make that assumption is that fair yeah that sounds good to me
[Music]
yeah I mean I think you can encode these things and you know really low
quality like 96 kilobits per second all the way up to 320 kilobits but right in
the middle 128 or 160 is about maybe about that so all right so let's assume that
and so then let's do let me just extrapolate from this so if you've got five megabytes per song
you've got 100 million songs so I think uh that translates to I'm trying to do my math here so
100 million would be uh going to a thousand times that would be a 100 billion and then 100 trillion
would be a million times times five would be 500 trillion so we're talking about uh total
audio is something like 500 terabytes which is I guess it's the same thing is half a petabyte of
data and so that's I'm making that assumption and so then depending on how you replicate this data
because you typically want to have these songs in multiple you know have multiple copies of these
things so they don't get lost Etc and if some replica is down so you know maybe you do let's
say oops sorry my keyboard here 3x replication so that would mean like one and a half petabytes of
of raw audio data and then and then each song in terms of the metadata meaning the you know
the the song title and things like that and and artists and all these things the metadata
is probably not very big it might be I don't know you know 100 bytes per song or something like that
per song of metadata and so you might wind up with okay so let me do the math that's 100 bytes times 100 million songs which is uh
10 billion bytes so that's only 10 gigabytes of of song metadata it's not really very much
and even if you say it's it's a kilobyte then that's 100 gigabytes so it's not very much and
let me just double check my math here 100 million times 100 really 100 bytes yeah I
think that would be 10 gigabytes so even if we said if we round it up dramatically and said
100 gigabytes that's just not a ton of data now in terms of users uh user data you might
have you know again like a maybe you have a kilobyte oops kilobyte per user metadata I
guess it's all metadata and so uh that might be [Music] times a billion users so that would be a
terabyte of data approximately so just to kind of so this is just me doing some quick metrics
quick calculations to get a rough back of the envelope of how big these things are
uh so does that make sense so far does that seem yeah that will make sense so far
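The back-of-the-envelope numbers above can be checked with a short sketch. It uses only the figures quoted in the interview (100 million songs, a billion users, about 5 MB per song, 3x replication, 100 bytes to 1 KB of metadata per song, about 1 KB per user). The decimal units (1 MB = 10**6 bytes) and all variable names are assumptions made for round numbers, not anything Spotify publishes.

```python
# Storage estimate using the figures quoted in the interview.
# Assumption: decimal units, so 1 MB = 10**6 bytes.
KB, MB, GB, TB, PB = 10**3, 10**6, 10**9, 10**12, 10**15

NUM_SONGS = 100_000_000          # capacity target given by the interviewer
NUM_USERS = 1_000_000_000        # one billion users
AVG_SONG_BYTES = 5 * MB          # "typical" MP3 size assumed in the interview
REPLICATION_FACTOR = 3           # 3x replication
SONG_METADATA_BYTES = 100        # low estimate per song
SONG_METADATA_UPPER_BYTES = KB   # rounded-up estimate per song
USER_METADATA_BYTES = KB         # rough estimate per user

audio_bytes = NUM_SONGS * AVG_SONG_BYTES
replicated_audio_bytes = audio_bytes * REPLICATION_FACTOR
song_metadata_bytes = NUM_SONGS * SONG_METADATA_BYTES
song_metadata_upper_bytes = NUM_SONGS * SONG_METADATA_UPPER_BYTES
user_metadata_bytes = NUM_USERS * USER_METADATA_BYTES

print(f"audio, one copy:       {audio_bytes / TB:.0f} TB")
print(f"audio, replicated:     {replicated_audio_bytes / PB:.1f} PB")
print(f"song metadata (low):   {song_metadata_bytes / GB:.0f} GB")
print(f"song metadata (high):  {song_metadata_upper_bytes / GB:.0f} GB")
print(f"user metadata:         {user_metadata_bytes / TB:.0f} TB")
```

The takeaway matches what Mark says: audio dominates by several orders of magnitude, which is what later motivates storing it separately from the metadata.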
before starting your high level design as Mark did get some metrics to help you identify any higher
level decisions that might be influenced by the scale of the system Mark asked the right questions
about the number of users and number of songs and made some rough calculations without getting
bogged down in the numbers what Mark didn't calculate was the traffic how many songs are
streamed per day or per second even that would have helped him figure out the scaling of the
web servers how many he'd need and how much bandwidth he'd need however since this didn't
impact his high level design it was okay not to go into the details of scaling at this point
okay so let me let me maybe think about some basic components here and I'm going
to draw them but I I'll get into them maybe a little bit more in detail later
okay
so some basic components that I would see so let me let me start with something that is my my best
attempt at a mobile phone here so you've got your Spotify app and make this smaller so it fits into
the thing so you've got your app which is on your phone typically and I'm gonna assume here that
we're talking about the phone app because that's very common uh and so you've got your your phone
the phone is going to be talking to uh an application server a Spotify application
server somewhere or web server I guess and so let's draw some let
me just put a like over here just say there's a Spotify web server uh and uh
and there's not going to just be one there's going to be lots of them and assume that that's a lot
and then of course just because this is a standard thing that you do when you have applications uh
you know in the cloud here this is I guess part of the cloud you've got a load balancer and so let me
draw let's assume that we've got an arrow here that's talking so Spotify app's talking through
the load balancer ultimately to these to these web servers and so those are the those are some some
sort of key components here let me think about this a little bit further yeah I think that's
good so far okay and then of course the most important thing or one of the most important
things here let me see if I can find a I know that there's a shape here here we go that's a that's a
good shape I'm just going to say call this uh the database and let me make this a little bit more
uh let's see that's what I wanted okay so there's database and so the web servers are going to be
talking to the database and getting stuff from basically reading information writing information
to and from the data reading and writing information to and from the database
so those are kind of the very high level components in system design interviews good dual
bandwidth communication is important that's to say you need to communicate well via both drawing and
speaking you should practice this thoroughly before your interview Mark did this well he
started drawing the components quite early on in his answer and this allowed the interviewer to
follow along easily it's okay to have simpler and fewer components at this point in the interview
you can then break it out later in your answer as Mark did when he splits his database in two and
I probably am thinking about this I think this is obviously oversimplified and I'm not going to
redesign Spotify uh you know the existing full application as exists but I do think I can get
a little bit more detailed here by I think I'm thinking about the data again because it's it's
the data is important so I'm thinking about the metadata and the user data and the audio data
and they're different very different types of data so I'm going to actually I'm going to split this
database here into two things and I'll explain that or I'll try to explain them just a little
bit come on there we go make this just a little bit more uh so I'm going to split this up and say
that I'm gonna there's a like a song audio database
and then there's going to be like a metadata database so this is users uh songs
uh what else I mean ultimately there'd probably be artists etc things like that
but we're focusing on the songs I think in the in the music and so on this is also a database
of course song audio database and metadata and so I think let me make this a little bit more
uh this Spotify web server is actually going to talk to this to the meta oops the metadata and
uh and the audio database it's actually going to be talking to two databases here yeah that's
interesting can you can you can you go into a bit more detail about why you'd split them into two
yeah yeah I'm kind of making that uh just assuming that or making that yeah so okay let me let me
to try and I wanted I want to answer let me try and think of
this in terms of the technologies that I might use I might use something like
Amazon S3 here and I might use Amazon RDS and so now let me explain what I mean by that so
the type of data that the actual MP3 files those I think of as immutable data they're just Blobs of
data they're files they're five megabyte roughly on you know on average files and they're not going
to change they're just being streamed they're being you know stored and streamed essentially
and so a blob database which is what S3 is lends itself really well towards that and it could be
something like Google drive or there's other you know technologies that are sort of just
document like blob storage uh systems but Amazon S3 lends itself really well to that and it scales
uh greatly you can just add more and more and more and more and it scales uh linearly which is great
and and so and then you can connect it up to to be able to stream the data and things like that so
that type of data and the access patterns for that which is really mostly read you're just streaming
this data you're never going back and forth and writing to it that would be why I would want to
put the songs in that kind of a database now S3 is great for that works really well but the data
the metadata which is the the users and their information and the songs and their information
and uh maybe they get updated maybe you're searching across them and having to do queries
to find songs for a particular genre or artists or things like that or you're trying to update your
users like if I'm playing a song and I want to remember where I left off so that when
I continue playing it it remembers that I'm going to be modifying the data quite a bit or going back
and forth between that database and doing these queries and you can't really do queries over S3
but you can over like a relational database so MySQL would be another option uh I just chose
RDS here really Amazon's relational database as a as a because I'm going with the AWS family here
but yeah so uh let me let me try and try and just write something here a little bit more in terms of
in terms of songs you would have things like a song ID you would have a song Maybe
maybe a URL like that you'd use for sharing uh you would have an artist a genre maybe
there's a link to album cover and maybe there's the link to the to the audio I
guess so this is too too big to fit so this is the type of data that would be stored in
uh let me draw this in here again like in that in that database and then just the Raw
I mean I almost don't need to write this here so song MP3
would be stored in this database down here so yeah so I think that the access patterns the types of
things that you want to do the size of the data because the size of the data here I think we said
was going to be well half a petabyte in one in one for one replica so S3 lends itself well to
that the data here is going to be you know in the you know maybe terabyte range uh something
like that and and it's going to have lots of queries over it maybe some updates and things
like that so that's why I would separate these out is because I think the access patterns the
size of the data and the type of uh the type of queries that you need to be able to do over
would be very different does that make sense yes does that answer yeah yeah it does thank you yeah
I think that makes sense at this point of your answer when you've laid out the main components
it's good to start identifying some technologies as Mark did here starting with the database his
design made sense breaking up the databases into audio data and metadata but he could have
explained why he was doing this up front rather than waiting for the interviewer to ask why in
general try to be upfront about your decision making yeah feel free to feel free to continue
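The song metadata fields Mark lists (a song ID, a shareable URL, an artist, a genre, a link to the album cover, and a link to the audio) could be sketched as a single record type. This is a minimal illustration, not Spotify's actual schema: the class name, field names, and example values are all hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SongMetadata:
    """One row of the metadata database; every name here is hypothetical."""
    song_id: str
    share_url: str        # URL a user could share
    artist: str
    genre: str
    album_cover_url: str  # link to the cover image
    audio_link: str       # key of the MP3 in blob storage, not the MP3 itself


# Made-up example record, only to show the shape of the data.
example_song = SongMetadata(
    song_id="song-0001",
    share_url="https://example.com/songs/song-0001",
    artist="Example Artist",
    genre="k-pop",
    album_cover_url="https://example.com/covers/song-0001.jpg",
    audio_link="audio/song-0001.mp3",
)

print([f.name for f in fields(SongMetadata)])
```

Keeping only a link to the audio in this record is what lets the small, query-heavy metadata live in a relational database while the large, immutable MP3 files live in blob storage.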
so okay so let's say we have these two databases and I'm just trying to think about now the actual
the two the use case sorry the use cases we talked about so finding and playing music [Music]
so for finding music I think uh the the Spotify app you know would need
to uh request a do a find music and so the user probably you know either is typing
in like an artist's name or maybe they're selecting a bunch of filters I actually I'm
not that familiar with how possible that is but but let's say I'm searching for music uh
for an artist with a particular genre and so ultimately somehow I'm filling in this query
in my app either by clicking on some things or typing some things or a combination and then that
request to find music is going to be sent to this web server going through the load balancer to pick
a web server it's going to go there and then that web server is going to do a query issue a query uh
translated query to to the relational database to find a bunch of songs and return a list of songs
and return that back up to the app with whatever metadata there there is and let's say that I found
it it finds I don't know 100 songs that match my uh request for
uh uh Korean pop music or something like that and so I I now get back 100 songs and I can look
at those and now once I've gotten that list of songs and by the way for for that query where I'm
searching for music I don't touch the mp3s at all I don't need to go to that database at all I just
need to go to uh to the metadata database and do this query and because it's a relational database
that that query can happen pretty efficiently Etc so now I'm I return that information back
and that and then I display that to the user does that does that seem does that
make sense in terms of a finding music yeah yeah that makes sense that's good
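The find-music flow described above comes down to the web server turning the app's filters into one query against the metadata database, without touching the audio store. The sketch below uses Python's built-in sqlite3 with an in-memory database as a stand-in for a relational database such as RDS or MySQL. The table layout, the function names, and the sample rows are assumptions for illustration only.

```python
import sqlite3


def create_metadata_db():
    """Build a tiny in-memory stand-in for the metadata database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE songs (
            song_id    TEXT PRIMARY KEY,
            title      TEXT NOT NULL,
            artist     TEXT NOT NULL,
            genre      TEXT NOT NULL,
            audio_link TEXT NOT NULL
        )
        """
    )
    # An index on genre keeps the filter query from scanning every row.
    conn.execute("CREATE INDEX idx_songs_genre ON songs (genre)")
    return conn


def find_songs(conn, genre, artist=None, limit=100):
    """Return (song_id, title, artist, genre) rows; audio is never read here."""
    sql = "SELECT song_id, title, artist, genre FROM songs WHERE genre = ?"
    params = [genre]
    if artist is not None:
        sql += " AND artist = ?"
        params.append(artist)
    sql += " ORDER BY title LIMIT ?"
    params.append(limit)
    return conn.execute(sql, params).fetchall()


# Made-up rows, only to exercise the query.
db = create_metadata_db()
db.executemany(
    "INSERT INTO songs VALUES (?, ?, ?, ?, ?)",
    [
        ("s1", "First Song", "Artist A", "k-pop", "audio/s1.mp3"),
        ("s2", "Second Song", "Artist B", "k-pop", "audio/s2.mp3"),
        ("s3", "Third Song", "Artist C", "punk", "audio/s3.mp3"),
    ],
)
kpop_results = find_songs(db, "k-pop")
print(kpop_results)
```

Parameterized queries (the `?` placeholders) are used so that text typed by the user is never spliced directly into the SQL.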
um okay and so now I have a list of songs and maybe I maybe there's like the one of the ones
in that list is like oh yeah that's what I was looking for and so I click on that uh song to
play it I want to I want to hear that hear the song so that's that second sort of part of the
use case that we talked about which is playing music so now to play music now this gets a little
interesting and this is I think I'm going to wind up maybe changing what I'm thinking about here a
little bit but but let me let me just talk through it so I I click on the play button or I click on
the song to play it and that translates into a request from the app to the web server to start
playing a song playing the song and it has an ID like I mentioned as like an ID in it and so now
the the web server based on that song information ID maybe has to go to this
database here to look up the link uh what did I call it oh an audio link maybe that's an Mp3
link I don't know something like that that's a better term for it I don't know so the Mp3 link
and so that Mp3 link or audio link would be returned and now the Spotify web server has to
go to the database where the actual audio is stored and fetch that fetch that data now
you could imagine streaming that five megabytes from this database so chunk by
chunk it comes back and chunk by chunk it goes up to the Spotify app in order to do this you
need like a websocket connection so you need a kind of a long-standing connection between the
application and this web server so that you can chunk that you can send the data back in chunks
but I'm not sure whether I would need to do that or whether because it's only five megabytes that's
possibly small enough to just fit in memory you know read from the uh from S3 and fill fit into
memory and then have that particular web server chunk it back from there I think that might be
better because it would eliminate the possible lag between the database so you don't start streaming
the audio until you have it in memory that that might be something that I I would consider doing
so now let me think about so that that would be does that make sense in terms of like the
the playing like you're you start playing and obviously you can control it you can pause playing
and so on but does that make sense so you you you start playing the web server gets the request to
play it maybe gets the information about where to go over here to get the the S oops the S3 uh
storage audio storage reads that back it's only five megabytes that should be should not take
it should be almost instantaneous and then it starts streaming that back to the application
does that make sense yeah that makes sense uh please please carry on yeah so all good so far
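The play flow Mark walks through (look up the audio link by song ID, read the whole file of roughly five megabytes into memory, then send it back in chunks) could be sketched as a generator. Plain dictionaries stand in for the metadata database and the blob store, and the 64 KiB chunk size is an arbitrary assumption; none of these names come from a real Spotify or AWS API.

```python
from typing import Dict, Iterator

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, not a figure from the interview


def stream_song(
    song_id: str,
    audio_links: Dict[str, str],
    blob_store: Dict[str, bytes],
    chunk_size: int = CHUNK_SIZE,
) -> Iterator[bytes]:
    """Yield a song's audio in fixed-size chunks after one full read."""
    audio_link = audio_links[song_id]   # metadata lookup: song ID -> audio link
    audio = blob_store[audio_link]      # whole file read into memory first
    for offset in range(0, len(audio), chunk_size):
        yield audio[offset:offset + chunk_size]


# Fake data standing in for the two stores.
fake_audio = bytes(200_000)
fake_links = {"s1": "audio/s1.mp3"}
fake_blobs = {"audio/s1.mp3": fake_audio}

chunks = list(stream_song("s1", fake_links, fake_blobs))
print(len(chunks), "chunks,", sum(len(c) for c in chunks), "bytes")
```

Reading the full file before streaming trades a little web server memory for not depending on the blob store mid-playback, which is the trade-off Mark says he would consider.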
Okay so this almost sounds too good to be true or sounds too easy I I think it's got
to be more complicated than that okay so one thing I'm just realizing here is uh
probably out of these what did we say we said 100 million songs
there's there's probably a lot of stuff that is uh I'm going to use the term Indie artists or
something like that stuff that few people uh not very many people listen to or isn't very popular
and on the flip side of that I'm thinking about like what could go wrong here well if
uh like BTS which is a again a Korean pop pop band uh if they let's say they release a new song and
it's like hey everybody wants to listen to this is super oh have you heard the latest song Etc if
you've got all of these web servers all fetching that same thing uh that same song and let's say
you've got you know requests like it's released and in the next minute you've got I don't know
uh how do we you said a billion users okay let's say that uh 10 million users all request the song
all at the same time or something like that because it's just been released and they're
following you know social media or something you could easily overload in terms of bottlenecks you
could you could overload possibly uh that bucket or that bucket excuse me that particular
uh song uh that file or whatever however it's stored in AWS and it could be stored different
ways I'm using S3 here you could overload uh AWS or S3 there and you could also possibly be
you know like streaming like loading up all these these web servers with the same song streaming the
same thing back so it's a lot of bandwidth Etc so a better thing to do and a common thing to do for
stuff like this is to to use what's called a CDN content delivery Network which is like a it's a
cache so this this helps reduce the amount of load on on back ends so let me draw something here let
me just I am going to pick uh what shape am I going to pick I'll just pick randomly something
like this so maybe over here so I'm going to say here this is a CDN which is a song audio cache
this is typing in and so what would happen so and by the way the Technologies here just to this this
this CDN a Content delivery network is usually very very close in terms of number of hops and
network connections to users so what would happen is let me draw another arrow here and let me draw
uh an arrow down to here so what what would happen here is that the first time that this song is this
new BTS newly released single is is requested the web server would read it and stream it like normal
but probably we need to make sure that these Spotify web servers are somewhat uh
they're keeping track of things and they're keeping track of you know which songs are being
requested which ones are hot they probably have a heat map the most recently requested songs and so
in that heat map as they see oh this song is now ever I've seen this requested the
fifth third time in you know in a minute or something like that
at that point in time what it might do is it might actually uh instead of just streaming
it back to itself or copying it back to itself excuse me it might actually load that into the
CDN and so I am trying to I'm going to have to draw another arrow I think in order to do that
because these things probably have to talk to the CDN and by the way I'm not a super expert on this
area in terms of how this works but there's some connection between this Edge caching this content
delivery Network which is just there for caching and these Spotify web servers so these web servers
are going to somehow notify the CDN hey you should be pulling this song this BTS latest song
pull that from uh the the MP3 the audio storage so the CDN you know would load that up and this now
has to work in conjunction with the application so that means that the application when it when
the person decides when I request the song I want to play this latest BTS song and I haven't had it
on my phone yet it's just been released before I go and I talk to the web server I might even
go and check in this in the application might actually go and check in this content delivery
Network to see is it there it might also be so and if it is there then I just read it from there
the other option of the possibility and again this is technology that I'm sort of not super familiar
with is that it goes and asks the web server just like it normally would and the web server
sends back a redirect saying hey I don't have it but you should go over here to this content
delivery Network this this Edge cache to to to fetch the song it's much closer to you you'll
get better performance and you won't be loading me up so much so I've got a lot of arrows here
but uh the the the standard flow for a song first time around is you know coming around
getting the metadata going around to reading the audio data the web server is now keeping track
and realizing oh you know what this is getting I'm getting multiple requests for this same song
now I'm going to tell the CDN go and fetch this song it's going to fetch it and so now
the application can I can it can be redirected for example to read that audio from there so that's a
lot of talking but I was realizing that caching here seems like it's a very important Point
in in this in this design and would help with the bottleneck of of hot songs if you will
does that is that absolutely yeah by the way that's a really good answer
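The hot-song idea above (web servers keep a "heat map" of recent requests, push a song to the CDN once it crosses a threshold, then redirect later requests there) could be sketched as follows. Mark says himself that he is not an expert on how the web servers and the CDN actually coordinate, so treat this as one possible shape. The threshold of five requests per minute echoes his example, while the class name, the set standing in for the CDN, and the `cdn.example.com` URL are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Optional, Set, Tuple


class HotSongTracker:
    """Sliding-window request counter per song (the "heat map")."""

    def __init__(self, threshold: int = 5, window_seconds: float = 60.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._requests = defaultdict(deque)  # song_id -> recent timestamps

    def record(self, song_id: str, now: Optional[float] = None) -> bool:
        """Record one request; return True if the song is now hot."""
        now = time.monotonic() if now is None else now
        recent = self._requests[song_id]
        recent.append(now)
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) >= self.threshold


def handle_play(
    song_id: str,
    tracker: HotSongTracker,
    cdn_songs: Set[str],
    now: Optional[float] = None,
) -> Tuple[str, str]:
    """Return ("redirect", url) if the CDN has the song, else ("stream", id)."""
    if song_id in cdn_songs:
        return ("redirect", f"https://cdn.example.com/{song_id}")
    if tracker.record(song_id, now):
        # Stands in for notifying the CDN to pull the file from audio storage.
        cdn_songs.add(song_id)
    return ("stream", song_id)
```

With the default threshold, the first five requests inside a minute are streamed by the web server and the sixth is redirected to the CDN, so the backend only absorbs the start of a spike.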
okay and there's again in the AWS family here I'm going to use the family uh you know obviously if I
were a Googler you know I might not be using this but there's a technology called CloudFront
and CloudFront is uh basically a content delivery network there's a bunch of other ones
uh gosh flask is that what it is I can't remember what the names of these of these Technologies are
but there's a bunch of other technologies that are very similar in nature that allow you to do this
so that would help with the bottlenecks and uh in terms of the the the caching here you could also
and this is part of the reason by the way I just stepping backwards here a little bit part of the
reason why I think it would be better initially when the when the web server is getting the song
for the first time for it to read the the entire MP3 into its memory because then if it's if if it
is getting multiple requests for a particular song uh then it doesn't have to go to the database it
can actually just feed it from from its memory and that's also a form of caching of course so it's
got a local cache essentially of of uh of these songs and you could imagine having a shared cache
that's shared across these web servers so we that could be another optimization but my point is that
if we have a cache in memory in these web servers that stores the songs then we're offloading the
the database a little bit if we then have the cache here at this the edge of the network so
this is that's what it's called Edge Network Edge caching then we're offloading the web servers
to to a large extent because the streaming this is optimized for streaming and so on and I think
if we go one step further actually I'm I'm sure and if I were designing it I would design the
application because these are on smartphones uh to to store this the songs that are played frequently
by the specific user locally in the local storage so it's another form of caching so then if it
finds the song in the local in its local cache essentially local store here on the phone uh then
you know you you wouldn't even need to go to the to the network at all right so you could actually
I'll even draw like this so it's like a little little cache a little local storage here of songs
so there's multi-layer caching I guess what I'm saying multiple levels of caching ultimately with
the goal to provide the best user experience make sure it's super fast and to then offload
obviously the system to to allow it to sort of scale and limit costs Etc so that's probably
enough on caching and I think I've gone gone you know gone deep a little bit there
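The multiple levels of caching described above (local storage on the phone, the CDN at the edge, memory on the web server, and finally blob storage) amount to trying the nearest layer first and falling back outward. The sketch below uses dictionaries as stand-ins for each layer; the layer names and the function are hypothetical. As a simplification, it copies a song into every nearer layer after a hit, whereas the design in the interview only promotes hot songs to the CDN.

```python
from typing import Dict, List, Tuple

CacheLayer = Tuple[str, Dict[str, bytes]]  # (layer name, layer contents)


def fetch_audio(
    song_id: str,
    layers: List[CacheLayer],
    origin: Dict[str, bytes],
) -> Tuple[str, bytes]:
    """Return (where the song was found, audio), nearest layer first."""
    for index, (name, cache) in enumerate(layers):
        if song_id in cache:
            audio = cache[song_id]
            # Copy into the layers nearer to the user for next time.
            for _, nearer in layers[:index]:
                nearer[song_id] = audio
            return name, audio
    # Miss everywhere: read from blob storage and fill every layer.
    audio = origin[song_id]
    for _, cache in layers:
        cache[song_id] = audio
    return "blob storage", audio


cache_layers: List[CacheLayer] = [
    ("phone local storage", {}),
    ("CDN edge cache", {}),
    ("web server memory", {}),
]
blob_storage = {"s1": b"fake-mp3-bytes"}

first_hit = fetch_audio("s1", cache_layers, blob_storage)
second_hit = fetch_audio("s1", cache_layers, blob_storage)
print(first_hit[0], "->", second_hit[0])
```

A real client would also need eviction and invalidation in each layer; those are left out here because the interview does not cover them.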
be ready for the interviewer to ask you about particular components or different aspects of your
system and to talk in detail about them it's also good to talk through the flow of a particular use
case this helps make sure the interviewer is clear on it but also helps you test your design as you
talk through it as well as helping you identify any potential bottlenecks as it did with Mark here
Mark explained the purpose of the caching and also where caches might exist in the system caching
wasn't limited to the CDN and the app but also the web server he could have talked about caching
the metadata as well as the song data we've done caching uh that's all good uh now I wanted to ask
you about load balancing uh yeah how would you think about load balancing for for this particular app
yeah magic load balancing it's uh yeah a little a little bit of hand wavy here I think of load
balancing I mean so load balancing uh uh commonly is used to the load balancer's job here really
is to make sure that these web servers are don't are not overloaded so that they're providing
good service to the end users and they're not overloaded and overloaded can can mean
many things often it's in terms of CPU so you know you're getting requests in lots of requests coming
in uh and if if you're getting lots of requests coming in then just load balancing these requests
across the server so that there's roughly the same number of requests on each one if assuming they're
all equal would probably also result in the CPU utilization being about equal I think for this
application I would think about load balancing I might think about it a little bit differently
because we're streaming data and so I might be not instead of using CPU as my load balancing
uh metric to figure out how to distribute the load I might be looking at possibly Network bandwidth
like is is because if if a particular web server is not doing a lot from a CPU perspective but
if it is uh i o bound or network bound meaning it's it's hitting its limit of its if it of its
a network connection then uh I wouldn't want the load balancer to send it more traffic because
it's just going to bog down and you're going to get skips and things like that so I might make
this load balancer aware of multiple metrics one of them being Network maybe memory for caching
although you could probably you can kind of limit that possibly but maybe not CPU for this purpose
it might be requests out outstanding or current streams or something like that there might be
some other metrics but I'd want to I would want to make this load balancer maybe a little bit smarter
than just a typical request uh round robin load balancing scheme so I'm not sure if that's getting
at what you're asking yeah yeah I think that answers my question load balancing is not a one
size fits all Mark mentioned looking at different metrics as a way to load balance across different
servers which was a clever approach Mark was very open about the fact that load balancing isn't an
area he's an expert on and this is okay it's fine to be honest when you're asked about things that are
at the limits of your knowledge okay cool yeah all right are you are you done with your design
or is there anything else that you want to add yeah let me think about that and give me give me
just a moment I I mean it again it you know I've way oversimplified this but ah I guess I think if
I were to also think about this at a global scale I mean Spotify is a global app and so
it's it's possible so uh replication I didn't really talk about replication other than to do the
math to say hey maybe three three x replication and replication right why do we do this well we do
do it to make sure that we have uh data available when there's an outage so if so you know if I if
I were to you know I mean like do this right to indicate three replicas but that's that's a little
naive and the same thing for the metadata of course but the replicating the data is
not just for availability and downtime it's also you would want to place those replicas
closer to where the users are and so from a Global Perspective you might have music that
is more local like maybe European punk rock is you know more listened to more in Europe and
maybe maybe BTS is you know because it's Korean pop is maybe it's more popular actually in Korea
or in Asia and so you might want to have the replicas of those songs uh and and maybe even
the metadata be more locally uh represented more locally so that you can get to it faster you don't
have to cross an ocean to get to the to the data and uh you know to reach it so kind of a Geo aware
strategy of data placement and possibly replication strategy that might be a a
refinement I think that I would might make to this design to make it just a little bit more
uh performant effective etc etc so yeah does that does that make sense yeah yeah I think that is
just an interesting point to add it's a good idea to wrap up your answer by referring back to the
requirements laid out at the beginning of the interview and confirming that your design meets
them if you have time it's always nice to think big and take a quick look at the problem from a
different dimension that could make it more complex Mark did this with his geolocation
idea he didn't go into detail but another interviewer might have explored this further and it could have
opened up a whole new aspect of the design overall an excellent answer from Mark yeah great job Mark
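Mark's geo-aware placement idea was left at a high level. One small piece of it, routing a user to the closest replica that holds a song, could be sketched like this. The region names and latency figures are invented for illustration, and a real system would measure latency rather than hard-code it.

```python
from typing import Dict, List, Tuple

# Hypothetical round-trip latencies in milliseconds: (user region, replica region).
LATENCY_MS: Dict[Tuple[str, str], int] = {
    ("asia", "asia"): 20,
    ("asia", "europe"): 180,
    ("asia", "us"): 150,
    ("europe", "asia"): 180,
    ("europe", "europe"): 20,
    ("europe", "us"): 90,
}


def pick_replica(
    user_region: str,
    replica_regions: List[str],
    latency_ms: Dict[Tuple[str, str], int] = LATENCY_MS,
) -> str:
    """Choose the replica region with the lowest latency to the user."""
    if not replica_regions:
        raise ValueError("song has no replicas")
    return min(replica_regions, key=lambda region: latency_ms[(user_region, region)])


# A song replicated in Asia and the US, requested from two different regions.
print(pick_replica("asia", ["asia", "us"]))
print(pick_replica("europe", ["asia", "us"]))
```

Deciding where to place the replicas in the first place (for example, more copies of Korean pop in Asia) is the harder half of the problem and is not modeled here.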
I think that was a a really good approach and yeah how did you feel now now looking back now
the interview is over let's say you've uh you've walked out the interview room
and breathed a big sigh of relief how how are you feeling that it's gone
I feel pretty good about it it's uh yeah it's definitely different being on the other side of
things and uh I I know that I'm sure I missed things and if you know people are watching this and
and uh wind up scheduling a session with me I'm you know happy to take the the feedback on
some things that I missed I'm sure I've missed things but uh but yeah it's fun fun to do this
though cool good stuff well uh yeah thanks very much and uh yeah thanks everyone for watching
hello I really hope you found that useful if you did you can like and subscribe and why not come
visit us at igotanoffer.com there you can find more videos useful frameworks and question guides
all completely free and you can also book expert feedback one-to-one with our coaches from Google,
Meta, Amazon etc thank you and good luck with your interview [Music] all right
[Music]