I received an email today with some questions about diskprices.com. I get some of these pretty frequently, so I thought it would be good to share the responses more widely. Shared with permission from Mr. Doyle.

On Fri, Sep 27, 2024 at 12:05 PM Luke Doyle wrote:

Anyway diskprices.com is an amazing website, very easy to use, no BS, and the monetization strategy is straightforward - just affiliate links. No annoying ads or whatever.

Technically, the whole site is an ad. It’s just one that assumes the consumer is not an idiot.

As a fairly seasoned software engineer (10 years working in tech in SF), I started looking into how it works. No JS, barely any CSS, all server-side rendered.

That’s not strictly true… All of the filtering is done client side. The only query string parameter the server looks at is locale.

I was wondering if you don’t mind sharing - how is the backend put together?

Django on SQLite on EC2. I try to keep things simple.

You must be scraping Amazon somehow, do they ever block you?

If you sign up for the Affiliate Program you get access to the Product Advertising API after you’ve referred 3 orders in a 90-day period. There are a lot of restrictions on what you can do with the data and how you display it, so read the Operating Agreement and PA-API Terms before writing any code. The API itself is annoying to work with. You’ll want some sort of task queue to perform requests within the rate limits and retry failures, which are rare but do happen.
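
A minimal sketch of what such a queue could look like, assuming a fixed minimum gap between requests and a small retry budget. This is not the site's actual code: the names run_rate_limited and RetryableError, the one-second interval, and the three attempts are all made up for illustration, and the call argument stands in for whatever function actually talks to the API.

```python
import time
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


class RetryableError(Exception):
    """Hypothetical marker for failures worth retrying (throttling, timeouts)."""


def run_rate_limited(
    items: Iterable[T],
    call: Callable[[T], R],
    min_interval: float = 1.0,  # assumed minimum gap between requests, in seconds
    max_attempts: int = 3,      # assumed retry budget per item
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[Optional[R]]:
    """Run call(item) for each item, at least min_interval seconds apart.

    A RetryableError triggers an exponential backoff and another attempt.
    An item that uses up all its attempts yields None in the result list.
    """
    results: List[Optional[R]] = []
    last_call: Optional[float] = None
    for item in items:
        result: Optional[R] = None
        for attempt in range(max_attempts):
            # Wait until the minimum gap since the previous request has passed.
            if last_call is not None:
                wait = min_interval - (clock() - last_call)
                if wait > 0:
                    sleep(wait)
            last_call = clock()
            try:
                result = call(item)
                break
            except RetryableError:
                # Back off a little longer after each failed attempt.
                if attempt + 1 < max_attempts:
                    sleep(min_interval * 2 ** attempt)
        results.append(result)
    return results
```

The sleep and clock parameters exist only so the loop can be exercised without real delays. In a long-running deployment the same idea would more likely live in a proper job queue than in a single function.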

Also getting the info from all the product listings must be some HTML parsing / regexes. Is it a mess or straightforward?

Most of what’s displayed on diskprices is parsed out of the product title, ignoring the rest of the metadata that PA-API gives me. It is a mess of regexes and conditionals. It’s not really structured enough for a formal grammar and it’s not really natural language either. An LLM might be able to make sense of the data, but that’s going to blow up your compute cost. Amazon has also recently added some new terms to the PA-API that forbid training AI with their data, so I’m not sure that’d be wise anyway.
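
To give a flavor of the "regexes and conditionals" approach, here is a small sketch that pulls a capacity, a pack size, and a drive technology out of a title. The patterns, the parse_title name, and the example titles are invented for illustration and are far simpler than anything a real catalog would need; the actual parser is not shown in this post.

```python
import re
from typing import Optional

# Capacity such as "2TB", "500 GB", or "1.92TB". Only GB and TB are handled here.
CAPACITY_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(TB|GB)\b", re.IGNORECASE)
# Pack size such as "2-Pack", "2 pack", or "Pack of 2".
PACK_RE = re.compile(r"(\d+)\s*-?\s*pack\b|\bpack\s+of\s+(\d+)", re.IGNORECASE)


def parse_title(title: str) -> Optional[dict]:
    """Return parsed fields, or None when no capacity can be found."""
    cap = CAPACITY_RE.search(title)
    if cap is None:
        return None
    size = float(cap.group(1))
    # Drive vendors use decimal units, so 1 TB is treated as 1000 GB.
    capacity_gb = size * 1000 if cap.group(2).upper() == "TB" else size

    pack = PACK_RE.search(title)
    count = int(pack.group(1) or pack.group(2)) if pack else 1

    lowered = title.lower()
    if "ssd" in lowered or "solid state" in lowered:
        technology = "SSD"
    elif "hdd" in lowered or "hard drive" in lowered:
        technology = "HDD"
    else:
        technology = None  # left for manual review

    return {"capacity_gb": capacity_gb, "count": count, "technology": technology}


print(parse_title("ExampleBrand 500 GB Portable Hard Drive (2-Pack)"))
```

Even this toy version shows where the mess comes from: the first capacity-looking number wins, so a title that mentions a cache size before the drive size would be parsed incorrectly and need a special case.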

How do you normalize all the data?

After parsing metadata out of the product info, it goes into the database where I manually review and fix incorrect listings. I have a dashboard of “new products” that I keep an eye on. I can do most edits from my phone, so it gives me something more productive to do than scrolling Reddit when I’m waiting for my order at the coffee shop or whatever.

Does the structure of the page often break and you need to redo the parser?

Nope. The frontend code is just one big SQL query and a template. It has changed very little in the last 5+ years.
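
For readers who want to picture "one big SQL query and a template", here is a rough sketch using Python's built-in sqlite3 module and a format string in place of Django's ORM and template engine. The product table, its columns, and the render_table function are assumptions made up for this example, not the site's real schema.

```python
import sqlite3
from html import escape

# Hypothetical schema: one row per listing, with capacity and price already parsed.
SCHEMA = """
CREATE TABLE product (
    name TEXT,
    capacity_tb REAL,
    price REAL,
    locale TEXT
)
"""

# The single query: filter by locale, compute price per TB, cheapest first.
QUERY = """
SELECT name, capacity_tb, price, price / capacity_tb AS price_per_tb
FROM product
WHERE locale = ? AND capacity_tb > 0
ORDER BY price_per_tb ASC
"""

# The "template": one table row per result.
ROW = "<tr><td>{per_tb:.2f}</td><td>{price:.2f}</td><td>{tb:g} TB</td><td>{name}</td></tr>"


def render_table(conn: sqlite3.Connection, locale: str) -> str:
    """Run the query for one locale and render the rows as an HTML table."""
    rows = conn.execute(QUERY, (locale,)).fetchall()
    body = "\n".join(
        ROW.format(per_tb=per_tb, price=price, tb=tb, name=escape(name))
        for name, tb, price, per_tb in rows
    )
    return "<table>\n" + body + "\n</table>"


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?)",
    [("Example Drive A", 4.0, 100.0, "us"), ("Example Drive B", 2.0, 40.0, "us")],
)
print(render_table(conn, "us"))
```

Because the page is a pure function of the database contents and the locale, the output caches well and there is very little that can break when the data changes.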

How do you deal with the product variants (like multiple listings within a page)? How to dedupe / prune / remove stale listings?

Amazon’s API for GetVariations is a headache because it has to be called for each ParentASIN separately. You can’t do this for all products without hitting the rate limit, so I just have a flag I can set that enables that query for a specific product. Each variation gets a separate record in the database, so deduplication and whatnot is pretty straightforward from there.
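
The flag-per-product idea can be sketched roughly as follows. Everything here is hypothetical: the Product fields, the fetch_variations flag name, and the get_variations callable, which stands in for a wrapper around the real GetVariations request and is assumed to return one Product per variation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Product:
    asin: str
    title: str
    parent_asin: str = ""
    fetch_variations: bool = False  # hypothetical opt-in flag, set by hand


def collect_records(
    products: Iterable[Product],
    get_variations: Callable[[str], List[Product]],
) -> Dict[str, Product]:
    """Return one record per ASIN, expanding variations only for flagged products."""
    records: Dict[str, Product] = {}
    seen_parents = set()
    for product in products:
        # Keying by ASIN means a repeated listing cannot create a duplicate.
        records.setdefault(product.asin, product)
        parent = product.parent_asin
        if product.fetch_variations and parent and parent not in seen_parents:
            # One request per parent, no matter how many flagged children share it.
            seen_parents.add(parent)
            for variation in get_variations(parent):
                records.setdefault(variation.asin, variation)
    return records
```

Limiting the expensive call to flagged products keeps the request count proportional to the number of listings someone has decided are worth expanding, rather than to the size of the whole catalog.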

On the operational side - how much traffic do you get?

I don’t really want to get into specific numbers, but it’s sorta irrelevant operationally. Most requests are served out of Cloudflare’s cache. The only requests that hit the server are cache refreshes and a POST to log a page view for analytics purposes, which is fairly lightweight.

Do the affiliate links return enough to cover server costs, your rent / mortgage etc?

By a large margin, yes. It makes about the same as an entry-level IT job. It took several years to grow to that point. I don’t really try to maximize that though, as I have other sources of income.

I know I’m being really nosy for a random stranger - again feel free to answer as much / as little as you want.

For what it’s worth, I’m not going to share any of this with anyone - you seem like a solo guy operating a successful, straightforward website and I sorta want to emulate that myself. I wish more websites like diskprices.com existed (especially with the enshittification of the internet). I’m wondering how you keep it lean, and if it’s not wildly complex I might try to build / host something like this myself as a side-gig, but only if the ROI on the affiliate links is worth it for a one-person operation if / when it gains traction. For instance, I wish gpuprices.com or pcpowersupplyprices.com existed. I know pcpartpicker.com exists, but it’s way more complex than what I personally want to deal with.

I’m not going to tell you that it’s definitely a good idea or going to be profitable. It only exists by the grace of Amazon. If they change their policies or decide they don’t like what you’re doing, that’s game over. shucks.top is a good example of this… They pushed the limits of the Operating Agreement too far and got their affiliate account killed. eBay seems to be more flexible/relaxed about affiliate marketing, but the marketplace is very different.

I’d suggest looking outside of the tech/computer parts product categories, since those are already pretty well served. Ideally you want something that is a commodity (i.e. products from different vendors are interchangeable) that’s also expensive enough for people to go to the effort of comparison shopping.

I keep a list (included below) of all of the diskprices clones or similar sites that I know about, so you can see what’s already been done (and what’s been done, but poorly).

Similar sites

I don’t endorse any of these sites or have any affiliation with their operators. I just think they’re neat.

Memory

https://ramstickprices.com/
https://ramprices.org/

Monitor

https://tvpricesindex.com/
https://www.screenprices.info/
https://www.monitorprices.org/
https://portablemonitorprices.wordpress.com/

GPU

https://gpuprices.us/
https://bestvaluegpu.com/
https://gpu-prices.com/
https://gpupricecompare.com/
https://gpuquicklist.com/
https://graphicscardprice.com/
https://www.azgpuprices.com/
http://www.zhusd.com/gpu

CPU

https://cpuscout.com/

Other

https://lipstickprices.com/
https://traderjoesprices.com/
https://smartsolarpricing.com/
https://www.findaniphone.com/
https://lens-camera-prices.com/us/lenses
https://micprices.net/
https://batterypackprices.com/
https://catfoodprices.com/
https://www.laptop-prices.org/
https://www.priceperprotein.com/
https://lowcostminipcs.com/
https://printerprices.info/
https://psuprices.pages.dev/
https://ethernetcableprices.com/