One thing that I haven't seen discussed in the comments is the inherent vulnerability of S3 pricing. Like all things AWS, if something goes sideways, you are suddenly on the wrong side of a very large bill. For instance, someone can easily blow your egress charges through the roof by making a massive number of requests for your assets hosted there.
While Cloudflare may reach out and say 'you should be on enterprise' when that happens on R2, the fact they also handle DDoS and similar attacks as part of their offering means the likelihood of success is much lower (as is the final bill).
Typically you would use S3 with CloudFront for hosting. S3 provides no protections because it's meant to be a durable and global service. CloudFront provides DDoS and other types of protection while making it easy to get prepaid bandwidth discounts.
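Concretely, the standard setup locks the bucket down so only your distribution can read from it, so all traffic flows through CloudFront's pricing and protections. A sketch of the usual origin-access-control bucket policy (bucket name, account, and distribution IDs are placeholders):

```ts
// Only CloudFront -- and only this specific distribution -- may read objects.
// Direct S3 URLs stop working, so nobody can run up raw S3 egress on you.
const bucketPolicy = {
  Version: "2012-10-17",
  Statement: [
    {
      Effect: "Allow",
      Principal: { Service: "cloudfront.amazonaws.com" },
      Action: "s3:GetObject",
      Resource: "arn:aws:s3:::my-assets-bucket/*",
      Condition: {
        StringEquals: {
          "AWS:SourceArn": "arn:aws:cloudfront::123456789012:distribution/EDFDVBD6EXAMPLE",
        },
      },
    },
  ],
};
```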
Just one data point, but adding Cloudflare to our stack (in front of "CloudFront with bandwidth discounts") took about $30k USD per year off our bandwidth bill.
In my experience the AWS WAF and DDoS mitigation are really expensive (minimum $40k-per-year contract) and are missing really basic DDoS handling capabilities (last I evaluated it, they did not have the ability to enforce client-side JS validation, which can be very effective against some bot networks). Maybe it has evolved since, but Cloudflare Enterprise was cheaper and more capable out of the box.
Also, once you are on Enterprise, they will not bug/charge you for contracted overages very often (like once a year) and will forgive significant overages if you resolve them quickly, in my experience.
I'm not really sure what point you're trying to make here. S3 bills you on, essentially, serving files to your customers. So yes, if your customers download more files then you get charged more. What exactly is the surprising part here?
The surprise is any ne'er-do-well can DDoS your bucket even if they aren't a customer. Genuine customer traffic volume will probably be known and expected, but putting an S3 bucket in the open is something like leaving a blank check on the internet.
It's a bit unfair to characterize that as a surprise on how much S3 bills you, no? The surprising part here is lack of DDoS protection on your end or leaving a bucket public and exposed. AWS is just charging you for how much it served, it doesn't make sense to hold them to a fault here.
> The surprising part here is lack of DDoS protection on your end or leaving a bucket public and exposed.
It doesn't take anything near DDoS. If you dare to put up a website that serves images from S3, and one guy on one normal connection decides to cause you problems, they can pull down a hundred terabytes in a month.
Is serving images from S3 a crazy use case? Even if you have signed and expiring URLs it's hard to avoid someone visiting your site every half hour and then using the URL over and over.
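For context, minting such a URL is trivial; expiry is the only knob you really get. A rough sketch with the AWS SDK v3 (bucket and key names are made up):

```ts
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const s3 = new S3Client({ region: "eu-west-3" });

// Placeholder bucket/key; expiresIn is the only real lever you have.
async function signImageUrl(key: string): Promise<string> {
  const command = new GetObjectCommand({ Bucket: "my-assets-bucket", Key: key });
  // Valid for 30 minutes -- but whoever holds the URL can re-download the
  // object as many times as they like until it expires, which is exactly
  // the repeat-download problem described above.
  return getSignedUrl(s3, command, { expiresIn: 1800 });
}
```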
> AWS is just charging you for how much it served, it doesn't make sense to hold them to a fault here.
Even if it's not their fault, it's still an "inherent vulnerability of S3 pricing". But since they charge so much per byte with bad controls over it, I think it does make sense to hold them to a good chunk of fault.
If you want to hire someone to walk your dog, you probably won't put an ad in the New York Times for a headhunter you'll pay by the hour with no oversight, and then call it totally unfair to that headhunter when you don't want to pay them for the time spent on all those interviews. But an infinitely scalable service you somehow can't put immediate, hard spending limits on is somehow fine on the cloud.
It loses trust with customers when the simple setup is flawed.
S3 is rightly built to support as much egress as any customer could want, but it's wrong to make it complex to set up rules that limit the bandwidth and the price.
It should be possible to use the service, especially common ones like S3 with little knowledge of architecture and stuff.
There was a backlash about being billed for unauthorized requests. It's since been updated[0]. I don't know whether everyone affected was retroactively refunded.
I've benchmarked R2 and S3, and S3 is well ahead in terms of latency, especially on ListObject requests. I think R2 has some kind of concurrency limit, as ListObject requests seem to have an increased failure rate when served simultaneously.
I have a few of the S3-alikes wired up live over the internet that you can try yourself in your browser. Backblaze is surprisingly performant, which I did not expect (S3 is still king though).
Is R2 egress actually free, or is it like CF's CDN egress, which is "free" until they arbitrarily decide you're using it too much or using it for the wrong things, so now you have to pay $undisclosed per GB?
Do you have any examples of the latter? From what I remember reading, the most recent case was a gambling website and cloudflare wanted them to upgrade to a tier where they’d have their own IPs. This makes sense because some countries blanket ban gambling website IPs.
So apart from ToS abuse cases, do you know any other cases? I ask as a genuine curiosity because I’m currently paying for Cloudflare to host a bunch of our websites at work.
Put another way, if Cloudflare really had free unlimited CDN egress then every ultra-bandwidth-intensive service like Imgur or Steam would use them, but they rarely do, because at their scale they get shunted onto the secret real pricing that often ends up being more expensive than something like Fastly or Akamai. Those competitors would be out of business if CF were really as cheap as they want you to think they are.
The point where it stops being free seems to depend on a few factors, obviously how much data you're moving is one, but also the type of data (1GB of images or other binary data is considered more harshly than 1GB of HTML/JS/CSS) and where the data is served to (1GB of data served to Australia or New Zealand is considered much more harshly than 1GB to EU/NA). And how much the salesperson assigned to your account thinks they can shake you down for, of course.
> Cloudflare’s content delivery network (the “CDN”) Service can be used to cache and serve web pages and websites. Unless you are an Enterprise customer, Cloudflare offers specific Paid Services (e.g., the Developer Platform, Images, and Stream) that you must use in order to serve video and other large files via the CDN. Cloudflare reserves the right to disable or limit your access to or use of the CDN, or to limit your End Users’ access to certain of your resources through the CDN, if you use or are suspected of using the CDN without such Paid Services to serve video or a disproportionate percentage of pictures, audio files, or other large files. We will use reasonable efforts to provide you with notice of such action.
I was going to say that it's odd, then, that reddit doesn't serve all of its posts' JSON via a free account at Cloudflare and save a ton of money, but maybe that's actually just peanuts compared to the total costs? So Cloudflare is basically only happy to host the peanuts for you to get you on their platform, but once you want to serve things where CDNs (and especially "free" bandwidth) really help, it stops being allowed?
Their ToS enforcement seems weak and/or arbitrary. There are a lot of scummy and criminal sites that use their services without any issues it seems. At least they generally cooperate with law enforcement when requested to do so but they otherwise don't seem to notice on their own.
Good to know. Please make an uncontroversial list of all the human activities that you think shouldn't be allowed on cloudflare (or perhaps in general). Then we can all agree to abide by it, and human conflict will end!
Cloudflare is a company, not a public utility. If they want to disallow any sites that make fun of cuttlefish they get to do that. If you want a CDN that follows the rules of a public utility I think you're out of luck on this planet.
In addition to this, if CF's, say, payment provider hated people making fun of cuttlefish, it might make sense for CF to ban marine mollusc mockery there also.
Nice job painting CF as the bad guy. They do NOT provide services to such sites; again and again they have terminated them for breach of ToS and cooperated with the legal system.
One thing to think about with S3 is that there are use cases where the price is very low, which the article didn't mention.
For example maybe you have ~500 GB of data across millions of objects that has accumulated over 10 years. You don't even know how many reads or writes you have on a monthly basis because your S3 bill is $11 while your total AWS bill is orders of magnitude more.
If you're in a spot like this, moving to R2 to potentially save $7 or whatever it ends up being would cost a lot more in engineering time to do the move. Plus there are old links that might be pointing to a public S3 object, such as email campaign links, which would break if you moved them to another location.
I think the most reasonable way to analyze this puts non-instant-access Glacier in a separate category from the rest of S3. R2 doesn't beat it, but R2 is not a competitor in the first place.
Yes, but if it's your third location of 3-2-1 then it can also make sense to weigh it against data recovery costs on damaged hardware.
I backup to Glacier as well. For me to need to pull from it (and pay that $90/TB or so) means I've lost more than two drives in a historically very reliable RAIDZ2 pool, or lost my NAS entirely.
I'll pay $90/TB over unknown $$$$ for a data recovery from burned/flooded/fried/failed disks.
Retrieval? For an external backup? If I need to restore and my local backup is completely down, it either means I lost two drives (very unlikely) or the house is a calcinated husk and at this point I'm insured.
And let's be honest. If the house burns down, the computers are the third thing I get out of there after the wife and the dog. My external backup is peace of mind, nothing more. I don't ever expect to need it in my lifetime.
High 3- and 4-figure costs wouldn't occur for personal backups though. I've done a big retrieval once and the cost was literally just single-digit dollars for me. So the total lifetime cost (including retrievals) is cheaper on S3 than R2 for my personal backup use case. This is why I struggle to take seriously any analysis that says S3 is expensive -- it is only expensive if you use the most expensive (default) S3 product. S3 has more options to offer than R2 or other competitors, which is why I stay with S3 and pay <$1.00 a month for my entire backup. Most competitors (including R2) would have me pay significantly more than I spend on the appropriate S3 product.
Curious, did you go through the math of figuring out how much the initial file transfer and ongoing cost will set you back? (Not a lot, from the sounds of it.) Should be easy enough to do, but I've just not found the time yet to do that for a backup I'm intending to send to S3 as well.
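My own back-of-envelope attempt, using list prices as I remember them (these are assumptions, not figures from the thread; double-check the current pricing page):

```ts
// Rough archival-backup math, us-east-1-ish list prices from memory.
const GB = 500;                              // hypothetical backup size
const DEEP_ARCHIVE_PER_GB_MONTH = 0.00099;   // assumed Glacier Deep Archive storage rate
const EGRESS_PER_GB = 0.09;                  // assumed data-transfer-out rate ("$90/TB" above)

const monthlyStorage = GB * DEEP_ARCHIVE_PER_GB_MONTH; // ~$0.50/month
const worstCaseRestore = GB * EGRESS_PER_GB;           // ~$45, plus a small retrieval fee
const uploadCost = 0;                                  // ingress is free; per-request fees are pennies

console.log({ monthlyStorage, worstCaseRestore, uploadCost });
```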
“Here’s a bunch of great things about CloudFlare R2 - and please buy my book about it” leaves a bad taste in my mouth.
Also, has CF improved their stance around hosting hate groups? They have strongly resisted pressure to stop hosting/supporting hate sites like 8chan and Kiwifarms, and only stopped reluctantly.
Has AWS improved their stance around hosting resource draining robots.txt defying scrapers and spiders?
Every large company does business with many people / orgs I don't like. I'm not defending or attacking AWS or CF, but merely stating the deeper you dig the more objectionable stuff you'll find everywhere. There are shades of gray of course, but at the end of the day we're all sinners.
I don’t have to support 8chan or KiwiFarms to say that Cloudflare has absolutely no role in policing the internet. The job of policing the internet is for the police. If it’s illegal, let them investigate.
If they are known bad actors, let the police do the job of policing the internet. Otherwise, all bad actors are ultimately arbitrarily defined. Who said they are known bad actors? What does that even mean? Why does that person determining bad actors get their authority? Were they duly elected? Or did one of hundreds of partisan NGOs claim this? Who elected the NGO? Does PETA get a say on bad actors?
Be careful what you wish for. In some US States, I am sure the attorney general would send a letter saying to shut down the marijuana dispensary - they're known bad actors, after all. They might not win a lawsuit, but winning the support of private organizations would be just as good.
> they certainly don’t have to condone it via their support
Wow, what a great argument. Hacker News supports all arguments here by tolerating people speaking and not deleting everything they could possibly disagree with.
Or maybe, providing a service to someone, should not be seen as condoning all possible uses of the service. Just because water can be used to waterboard someone, doesn't mean Walmart should be checking IDs for water purchasers. Just because YouTube has information on how to pick locks, does not mean YouTube should be restricted to adults over 21 on a licensed list of people entrusted with lock-picking knowledge.
Cloudflare protects these organizations. Cloudflare goes far above and beyond what most other companies do, and I personally can't wait to see Cloudflare held liable for content they host for which they ignore legitimate complaints.
Imagine if I were to call your phone every hour, on the hour, and my position was that it's not illegal until someone reports it as a crime and the police or a court contacts me to tell me to not do it. That's Cloudflare. They're assholes, and insisting that they're not responsible for anything, any time, until there's a court order is just them being assholes.
Ouch. Object versioning is one of the best features of object storage. It provides excellent protection from malware and human error. My company makes extensive use of versioning and Object Lock for protection from malware and data retention purposes.
As @yawnxyz mentioned, versioning is straightforward to do via Workers (untested sample: https://gist.github.com/CharlesWiltgen/84ab145ceda1a972422a8...), and you can also configure things so any deletes and other modifications must happen through Workers.
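In case the gist link rots, the shape of it is roughly this: an untested sketch assuming an R2 binding named BUCKET and a made-up version-prefix scheme.

```ts
// Untested sketch: copy the current object to a timestamped "version" key
// before overwriting it, so overwrites through this Worker stay recoverable.
export interface Env {
  BUCKET: R2Bucket; // R2 binding configured in wrangler.toml (name assumed)
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const key = new URL(request.url).pathname.slice(1);

    if (request.method === "PUT") {
      const existing = await env.BUCKET.get(key);
      if (existing) {
        // Preserve the previous version under a prefix of our own invention.
        await env.BUCKET.put(`__versions/${key}/${Date.now()}`, existing.body);
      }
      await env.BUCKET.put(key, request.body);
      return new Response("ok", { status: 200 });
    }

    const object = await env.BUCKET.get(key);
    return object
      ? new Response(object.body, { status: 200 })
      : new Response("not found", { status: 404 });
  },
};
```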
I have tried to find a CDN provider that offers access control similar to CloudFront's signed cookies, but failed to find something that matches it.
This is a major drawback with these providers offering S3-style bucket storage, because most of the time you want to serve the content from a CDN, and offloading access control to the CDN via cookies makes life so much easier. You only need to set the cookies for the user's session once, and they are automatically sent (by the web browser) to the CDN with no additional work needed.
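For anyone who hasn't used it, the CloudFront side is one signing step per session; a rough, untested sketch with the AWS SDK v3 signer (domain, key-pair ID, and key are placeholders):

```ts
import { getSignedCookies } from "@aws-sdk/cloudfront-signer";

// Custom policy so one set of cookies covers everything under /private/
// for the next hour. Domain, key-pair ID, and key are placeholders.
const policy = JSON.stringify({
  Statement: [
    {
      Resource: "https://cdn.example.com/private/*",
      Condition: {
        DateLessThan: { "AWS:EpochTime": Math.floor(Date.now() / 1000) + 3600 },
      },
    },
  ],
});

const cookies = getSignedCookies({
  policy,
  keyPairId: "K2JCJMDEHXQW5F",             // ID of the CloudFront public key
  privateKey: process.env.CF_SIGNING_KEY!, // matching private key, kept server-side
});

// Set the returned cookies once on the session; the browser then attaches
// them to every CDN request with no further work.
for (const [name, value] of Object.entries(cookies)) {
  console.log(`Set-Cookie: ${name}=${value}; Secure; HttpOnly; Domain=cdn.example.com; Path=/private`);
}
```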
IAM gets only a tiny mention as not present, therefore making R2 simpler. But also... IAM is missing, and a lot of interesting use cases are not possible there. No access by path, no 2FA enforcement, no easy SSO management, no blast-radius limits -- just: would you like a token that can write a file, but also delete everything? This is also annoying for their zone management, for the same reason.
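For contrast, this is the kind of scoping IAM gives you on S3 that R2's tokens can't express today; a sketch of a path-scoped policy (bucket and prefix names are hypothetical):

```ts
// Grants read/write under one prefix only, with no ability to delete
// or to touch anything else in the bucket.
const pathScopedPolicy = {
  Version: "2012-10-17",
  Statement: [
    {
      Effect: "Allow",
      Action: ["s3:GetObject", "s3:PutObject"],
      Resource: "arn:aws:s3:::my-app-bucket/tenants/tenant-42/*",
    },
    {
      Effect: "Allow",
      Action: "s3:ListBucket",
      Resource: "arn:aws:s3:::my-app-bucket",
      Condition: { StringLike: { "s3:prefix": "tenants/tenant-42/*" } },
    },
  ],
};
```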
My experience: I put parquet files on R2, but HTTP Range requests were failing. 50% of the time it would work, and 50% of the time it would return all of the content and not the subset requested. That’s a nightmare to debug, given that software expects it to work consistently or not work at all.
Seems like a bug. Had to crawl through documentation to find out the only support is on Discord (??), so I had to sign up.
Go through some more hoops and eventually get to a channel where I received a prompt reply: it's not an R2 issue, it's "expected behaviour" due to an issue with "the CDN service".
I mean, sure. On a technical level. But I shoved some data into your service and basic standard HTTP semantics were intermittently not respected: that's a bug in your service, even if the root cause is another team.
None of this is documented anywhere, even if it is "expected". Searching for "r2 http range" [1] shows I'm not the only one surprised.
Not impressed, especially as R2 seems ideal for serving Parquet data for small projects. This and the janky UI plus weird restrictions makes the entire product feel distinctly half finished and not a serious competitor.
Of course not, and it’s completely correct behaviour: if a server advertises it supports Range requests for a given URL, it’s expected to support it. Garbage in, garbage out.
It’s not clear how you’d expect to handle a webserver trying to send you 1Gb of data after you asked for a specific 10kb range other than aborting.
"Conversely, a client MUST NOT assume that receiving an Accept-Ranges field means that future range requests will return partial responses. The content might change, the server might only support range requests at certain times or under certain conditions, or a different intermediary might process the next request." -- RFC 9110
Sure, but that’s utterly useless in practice because there is no way to handle that gracefully.
To be clear: most software does handle it, because it detects this case and aborts.
But to a user who is explicitly asking to read a parquet file without buffering the entire file into memory, there is no distinction between a server that cannot handle any range requests and a server that can occasionally handle range requests.
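To be concrete, the detect-and-abort that most software does looks roughly like this (untested sketch; the URL is a placeholder):

```ts
// Ask for a byte range; if the server ignores the Range header and starts
// streaming the whole object, bail out instead of buffering it all.
async function readRange(url: string, start: number, end: number): Promise<Uint8Array> {
  const res = await fetch(url, { headers: { Range: `bytes=${start}-${end}` } });

  if (res.status === 206) {
    return new Uint8Array(await res.arrayBuffer()); // partial content, as requested
  }
  if (res.status === 200) {
    // Server ignored the Range header (the intermittent behaviour described above).
    await res.body?.cancel();
    throw new Error("server returned full content instead of the requested range");
  }
  throw new Error(`unexpected status ${res.status}`);
}

// Example: read the first 10 KB of a file (URL is a placeholder).
readRange("https://example.com/data.parquet", 0, 10_239).catch(console.error);
```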
This is a great comparison and a great step towards pressure to improve cloud service pricing.
The magic that moves the region sounds like a dealbreaker for any use cases that aren't public, internet-facing. I use $CLOUD_PROVIDER because I can be in the same regions as customers and know the latency will (for the most part) remain consistent. Has anyone measured latencies from R2 -> AWS/GCP/Azure regions similar to this[0]?
Also, does anyone know if R2 supports the CAS operations that so many people are hyped about right now?
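(I'm assuming that means the conditional writes S3 added, i.e. If-None-Match on PUT. Against S3 it looks roughly like this with a recent SDK; whether R2's S3 compatibility layer honours the header is exactly my question. Bucket and key are placeholders.)

```ts
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" });

// Create-if-absent: the PUT fails with 412 Precondition Failed if the key
// already exists, which is the building block for locks / leader election.
await s3.send(
  new PutObjectCommand({
    Bucket: "my-bucket",   // placeholder
    Key: "locks/leader",   // placeholder
    Body: "node-1",
    IfNoneMatch: "*",      // only succeed if no object exists at this key
  }),
);
```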
This really is a good article. My only issue is that it pretends that the only competition is between Cloudflare and AWS. There are several other low rent storage providers that offer an S3 compatible API. It's also worth looking at Backblaze and Wasabi, for instance. But I don't want to take anything away from this article.
Only tangentially related to the article, but I’ve never understood how R2 offers 11 9s of durability. I trust that S3 offers 11 9s because Amazon has shown, publicly, that they care a ton about designing reliable, fault-tolerant, correct systems (e.g. ShardStore and Shuttle).
Cloudflare’s documentation just says “we offer 11 9s, same as S3”, and that’s that. It’s not that I don’t believe them but… how can a smaller organization make the same guarantees?
It implies to me that either Amazon is wasting a ton of money on their reliability work (possible) or that cloudflare’s 11 9s guarantee comes with some asterisks.
Minimally, the two examples I cited: ShardStore and Shuttle. The former is a (lightweight) formally verified key-value store used by S3, and the latter is a model checker for concurrent Rust code.
Amazon has an entire automated reasoning group (researchers who mostly work on formal methods) working specifically on S3.
As far as I’m aware, nobody at cloudflare is doing similar work for R2. If they are, they’re certainly not publishing!
Money might not be the bottleneck for cloudflare though, you’re totally right
I think I overstated the case a little, I definitely don’t think automated reasoning is some “secret reliability sauce” that nobody else can replicate; it does give me more confidence that Amazon takes reliability very seriously, and is less likely to ship a terrible bug that messes up my data.
Great article. Do you have throughput comparisons?
I've found r2 to be highly variable in throughput, especially with concurrent downloads. s3 feels very consistent, but I haven't measured the difference.
I do mostly CRUD apps with Laravel and Vue. Nothing too complicated. Allows users to post stuff with images and files. I’ve moved ALL of my files from S3 to R2 in the past 2 years. It’s been slow as any migrations are but painless.
But most importantly for an indie dev like me the cost became $0.
At one company we were uploading videos to S3 and finding a lot of errors or stalls in the process. That led to evaluating GCP and Azure. I found that Azure had the most consistent (least variance) upload durations and better pricing. We ended up using GCP for other reasons like resumable uploads (IIRC). AWS now supports appending to S3 objects, which might have worked around the upload stalls. CloudFront for us at the time was overpriced.
To measure performance the author looked at latency, but most S3 workloads are throughput oriented. The magic of S3 is that it's cheap because it's built on spinning HDDs, which are slow and unreliable individually, but when you have millions of them, you can mask the tail and deliver multi TBs/sec of throughput.
It's misleading to look at S3 as a CDN. It's fine for that, but its real strength is backing the world's data lakes and cloud data warehouses. Those workloads have a lot of data that's often cold, but S3 can deliver massive throughput when you need it. R2 can't do that, and as far as I can tell, isn't trying to.
Yeah, I'd be interested in the bandwidth as well. Can R2 saturate 10/25/50 gigabit links? Can it do so with single requests, or if not, how many parallel requests does that require?
That's unrelated to the performance of (for instance) the R2 storage layer. All the bandwidth in the world won't help you if you're blocked on storage. It isn't clear whether the overall performance of R2 is capable of saturating user bandwidth, or whether it'll be blocked on something.
S3 can't saturate user bandwidth unless you make many parallel requests. I'd be (pleasantly) surprised if R2 can.
I'm confused, I assumed we were talking about the network layer.
If we are talking about storage, well, SATA can't give you more than ~5Gbps so I guess the answer is no? But also no one else can do it, unless they're using super exotic HDD tech (hint: they're not, it's actually the opposite).
What a weird thing to argue about, btw, literally everybody is running a network layer on top of storage that lets you have much higher throughput. When one talks about R2/S3 throughput, no one (in my circle, ofc.) would think we are referring to the speed of their HDDs, lmao. But it's nice to see this, it's always amusing to stumble upon people with a wildly different point of view on things.
We're talking about the user-visible behavior. You argued that because Cloudflare's CDN has an obscene amount of bandwidth, R2 will be able to saturate user bandwidth; that doesn't follow, hence my counterpoint that it could be bottlenecked on storage rather than network. The question at hand is what performance R2 offers, and that hasn't been answered.
There are any number of ways they could implement R2 that would allow it to run at full wire speed, but S3 doesn't run at full wire speed by default (unless you make many parallel requests) and I'd be surprised if R2 does.
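For what it's worth, the usual way to approach wire speed on either service is to split the object into byte ranges and fetch them in parallel; a rough sketch (URL, size, and part count are arbitrary, and it assumes size >= parts):

```ts
// Fetch one object as N concurrent byte-range requests and reassemble it.
// A single HTTP stream rarely saturates a fat pipe; many streams often do.
async function parallelGet(url: string, size: number, parts = 16): Promise<Uint8Array> {
  const chunk = Math.ceil(size / parts);
  const buffers = await Promise.all(
    Array.from({ length: parts }, async (_, i) => {
      const start = i * chunk;
      const end = Math.min(start + chunk - 1, size - 1);
      const res = await fetch(url, { headers: { Range: `bytes=${start}-${end}` } });
      if (res.status !== 206) throw new Error(`range not honoured (status ${res.status})`);
      return new Uint8Array(await res.arrayBuffer());
    }),
  );

  // Stitch the ranges back together in order.
  const out = new Uint8Array(size);
  let offset = 0;
  for (const b of buffers) {
    out.set(b, offset);
    offset += b.length;
  }
  return out;
}
```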
I have some large files stored in R2 and a 50Gbps interface to the world.
curl to Linode's speed test is ~200MB/sec.
curl to R2 is also ~200MB/sec.
I'm only getting ~1.6Gbps of that, but given that Linode's speed is pretty much the same, I would think the bottleneck is somewhere else. Put the other way, R2 gives you at least that much.
No, most people aren’t interested in subcomponent performance, just in total performance. A trivial example is that even a 4-striped U2 NVMe disk array exported over Ethernet can deliver a lot more data than 5 Gbps and store mucho TiB.
Cloudflare's paid DDoS protection product being able to soak up insane L3/4 DDoS attacks doesn't answer the question as to whether or not the specific product, R2 from Cloudflare which has free egress is able to saturate a pipe.
Cloudflare has the network to do that, but they charge money to do so with their other offerings, so why would they give that to you for free? R2 is not a CDN.
lol I think the only reason you're being downvoted is because the common belief at HN is, "of course marketing is lying and/or doesn't know what they're talking about."
I didn’t downvote but s3 does have low latency offerings (express). Which has reasonable latency compared to EFS iirc. I’d be shocked if it was as popular as the other higher latency s3 tiers though.
Very good article and interesting read. I did want to clarify some misconceptions I noted while reading (working from memory so hopefully I don’t get anything wrong myself).
> As explained here, Durable Objects are single threaded and thus limited by nature in the throughput they can offer.
R2 bucket operations are not limited by single-threaded Durable Objects; the team did a one-off thing just for R2 to let it run multiple instances. That's why the limits were lifted in the open beta.
> they mentioned that each zone's assets are sharded across multiple R2 buckets to distribute load which may indicated that a single R2 bucket was not able to handle the load for user-facing traffic. Things may have improve since thought.
I would not use this as general advice. Cache Reserve was architected to serve an absurd amount of traffic that almost no customer or application will see. If you’re having that much traffic I’d expect you to be an ENT customer working with their solutions engineers to design your application.
> First, R2 is not 100% compatible with the S3 API. One notable missing feature are data-integrity checks with SHA256 checksums.
This doesn't sound right. I distinctly remember when this was implemented for uploading objects. SHA-1 and SHA-256 should be supported (I don't remember about CRC). For some reason it's missing from the docs though. The trailer version isn't supported, and likely won't be for a while, for technical reasons (the Workers platform doesn't support HTTP trailers as it uses HTTP/1 internally). Overall compatibility should be pretty decent.
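For reference, the integrity check in question looks roughly like this against the S3 API (bucket, key, and file name are placeholders; point the client's endpoint at R2 or S3 as appropriate):

```ts
import { createHash } from "node:crypto";
import { readFile } from "node:fs/promises";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "auto" }); // set a custom endpoint for R2 if needed

// Compute the SHA-256 locally and send it with the upload; the service is
// expected to reject the PUT if what it received doesn't hash to the same value.
const body = await readFile("report.parquet");              // placeholder file
const digest = createHash("sha256").update(body).digest("base64");

await s3.send(
  new PutObjectCommand({
    Bucket: "my-bucket",                                     // placeholder
    Key: "report.parquet",
    Body: body,
    ChecksumSHA256: digest,                                  // base64-encoded digest
  }),
);
```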
The section on “The problem with cross-datacenter traffic” seems to rest on flawed assumptions rather than data. Their own graphs only show that, while public buckets have some occasional weird spikes, performance is pretty constant, whereas the S3 API has more spikiness, and the time-of-day variability is much more muted than the CPU variability. Same with the assumptions about bandwidth or other limitations of data centers. The more likely explanation would be the S3 auth layer; the time-of-day variability experienced matches more closely with how that layer works. I don’t know enough of the particulars of this author’s zones to hypothesize, but the S3 auth layer was always challenging from a perf perspective.
> This is really, really, really annoying. For example you know that all your compute instances are in Paris, and you know that Cloudflare has a big datacenter in Paris, so you want your bucket to be in Paris, but you can't. If you are unlucky when creating your bucket, it will be placed in Warsaw or some other place far away and you will have huge latencies for every request.
I understand the frustration, but there are very good technical and UX reasons this wasn’t done. For example, while you may think that “Paris datacenter” is well defined, it isn’t for R2, because unlike S3 your metadata is stored regionally across multiple data centers, whereas S3 (if I recall correctly) uses what they call a region: a single location broken up into multiple availability zones, which are basically isolated power and connectivity domains. This is an availability tradeoff -- us-east-1 will never go offline on Cloudflare because it just doesn’t exist; the location hint is the size of the availability region. This is done at both the metadata and storage layers too. The location hint should definitely be followed when you create the bucket, but maybe there are bugs or other issues.
As others noted throughput data would also have been interesting.
>Generally, R2's user experience is way better and simpler than S3. As always with AWS, you need 5 certifications and 3 months to securely deploy a bucket.
Yeah this is really annoying. That and replication to multiple regions is the reason we're not using R2.
Global replication was a feature announced in 2021 but still hasn't happened:
> R2 will replicate data across multiple regions and support jurisdictional restrictions, giving businesses the ability to control where their data is stored to meet their local and global needs.
Whenever a newcomer gets on the scene offering the same thing as some entrenched leader, only better, faster, and cheaper, the standard response is "Yeah, but it's less reliable. This may be fine for startups, but if you're <enterprise|government|military|medical|etc>, you gotta stick with the tried, tested, and true <leader>".
You see this in almost every discussion of Cloudflare, which seems to be rapidly rebuilding a full cloud, in direct competition with AWS specifically. (I guess it wants to be evaluated as a fellow leader, not an also-ran like GCP/Azure fighting for 2nd place)
The thing is, all the points are right. Cloudflare IS different - by using exclusively edge networks and tying everything to CDNs, it's both a strength and a weakness. There are dozens of reasons to be critical of them and dozens more to explain why you'd trust AWS more.
But I can't help but wonder whether the same thing happened (I wasn't on here, or really tech-aware enough) when S3 and EC2 came on the scene. I'm sure everyone said it was unreliable and uncertain, and had dozens of reasons why people should stick with (I can only presume) VMware, IBM, Oracle, etc.
This is all a shallow observation though.
Here's my real question, though. How does one go deeper and evaluate what is real disruption and what is fluff? Does Cloudflare have something unique and different that demonstrates a new world for cloud services I can't even imagine right now, as AWS did before it? Or does AWS have a durable advantage and benefits that will allow it to keep being #1 indefinitely? (GCP and Azure, as I see it, are trying to compete on specific slices of merit. GCP is all-in on 'portability'; that's why they came up with Kubernetes, to devalue the idea of any one public cloud and make workloads cross-platform across all clouds and on-prem. Azure seems to be competitive because of Microsoft's otherwise vertical integration with business/Windows/Office, and now AI services.)
Cloudflare is the only one that seems to show up over and over again and say "hey you know that thing that you think is the best cloud service? We made it cheaper, faster, and with nicer developer experience." That feels really hard to ignore. But also seems really easy to market only-semi-honestly by hand-waving past the hard stuff at scale.
Cloudflare's architecture is driven purely by their history of being a CDN and trying to find new product lines to generate new revenue streams to keep the share price up.
You wouldn't build a cloud from scratch in this way.