The easiest way is to download everything and let something like dupeGuru do the rest.
AFAIK Google makes the download extremely annoying, and Takeout strips the metadata from your images, so keep that in mind.
Also, for your own good, create a backup before deduplicating, just in case you do something wrong (this also lets you experiment with your duplicate file finder of choice without being scared of fucking up).
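If you want to sanity-check what a tool would flag before letting it touch anything, here is a rough sketch that only reports byte-identical files. This is not how dupeGuru works internally (it can also do fuzzy matching); it is plain SHA-256 content hashing, and the names `file_digest` and `find_exact_duplicates` are made up for this example. It never deletes anything.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose contents are byte-identical."""
    # Cheap first pass: files with different sizes cannot be identical.
    by_size = defaultdict(list)
    for p in root.rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        # Only hash files that share a size with at least one other file.
        by_hash = defaultdict(list)
        for p in same_size:
            by_hash[file_digest(p)].append(p)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Run it against the backup copy rather than the original download, e.g. `find_exact_duplicates(Path("photos_backup"))` (hypothetical folder name), and look through the groups by hand before deleting anything.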
They can try to block crawlers all they want.
They will not succeed without restricting access to Reddit to an unusable degree, since crawlers can be coded to imitate real users closely enough. Combine that with enough proxies and they can't do jack shit.
Also, you could get around the Referer header check quite easily via redirects (unless Reddit went ahead and used a whitelist for those, which again would be a very stupid decision), and there are a few more methods besides.