Track the trackers. Crawl for German websites that use Google Fonts (and a bit more).
Lio Novelli 35d4439d28 Fix return errors. 2022-02-06 19:10:30 +01:00

German Google Fonts pages

A spider that looks for German pages that load Google Fonts from Google's servers instead of hosting them locally.

Based on: https://docs.scrapy.org/en/latest/intro/tutorial.html
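
A minimal sketch of the kind of check such a spider could run on each page, using only the standard library. It treats any `<link>` whose `href` points at `fonts.googleapis.com` or `fonts.gstatic.com` as a hit. The names `GOOGLE_FONT_HOSTS` and `find_google_fonts` are illustrative and not taken from this repository, and fonts pulled in through CSS `@import` are not covered.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urlparse

# Hosts that serve Google Fonts stylesheets and font files.
GOOGLE_FONT_HOSTS = {"fonts.googleapis.com", "fonts.gstatic.com"}


class FontLinkParser(HTMLParser):
    """Collects href values of <link> tags that point at Google Fonts hosts."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        for name, value in attrs:
            if name == "href" and value:
                # hostname is None for relative URLs, so those never match.
                if urlparse(value).hostname in GOOGLE_FONT_HOSTS:
                    self.hits.append(value)


def find_google_fonts(html: str) -> List[str]:
    """Return all Google Fonts URLs referenced from <link> tags in html."""
    parser = FontLinkParser()
    parser.feed(html)
    return parser.hits
```

A page counts as "using Google Fonts hosted on Google" when the returned list is non-empty.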

Usage

pip3 install -e .
scrapy startproject ger_gfonts
cd ger_gfonts
scrapy crawl gfonts -O gfonts.json

TODO

!Implement a crawling spider: https://doc.scrapy.org/en/latest/topics/spiders.html#crawlspider

Start checking for Google Analytics on all EU websites.
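
A rough heuristic for this check, assuming the page HTML is already available as a string. It looks for the commonly used loader URLs (`analytics.js`, `ga.js`, `gtag/js`) and for IDs shaped like `UA-…` or `G-…`. The function name and both patterns are illustrative; sites that proxy or bundle the script will slip through.

```python
import re

# Loader scripts for Universal Analytics, legacy ga.js and gtag.js.
GA_SCRIPT_RE = re.compile(
    r"google-analytics\.com/(?:analytics|ga)\.js|googletagmanager\.com/gtag/js"
)
# Property IDs: UA-12345-1 style or G-XXXXXXXXXX style (heuristic lengths).
GA_ID_RE = re.compile(r"\b(UA-\d{4,10}-\d{1,4}|G-[A-Z0-9]{6,12})\b")


def detect_google_analytics(html: str) -> dict:
    """Report whether a GA loader script and any GA-looking IDs appear in html."""
    return {
        "script_found": bool(GA_SCRIPT_RE.search(html)),
        "ids": sorted(set(GA_ID_RE.findall(html))),
    }
```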

Check for the Meta Pixel, which is embedded like this:

<!-- Meta Pixel Code -->
<script>
!function(f,b,e,v,n,t,s)
{if(f.fbq)return;n=f.fbq=function(){n.callMethod?
n.callMethod.apply(n,arguments):n.queue.push(arguments)};
if(!f._fbq)f._fbq=n;n.push=n;n.loaded=!0;n.version='2.0';
n.queue=[];t=b.createElement(e);t.async=!0;
t.src=v;s=b.getElementsByTagName(e)[0];
s.parentNode.insertBefore(t,s)}(window, document,'script',
'https://connect.facebook.net/en_US/fbevents.js');
fbq('init', '898263220867925');
fbq('track', 'PageView');
</script>
<noscript><img height="1" width="1" style="display:none"
src="https://www.facebook.com/tr?id=898263220867925&ev=PageView&noscript=1"
/></noscript>
<!-- End Meta Pixel Code -->
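
The snippet above has three recognisable parts: the `fbevents.js` loader URL, the `fbq('init', …)` call and the `facebook.com/tr?id=…` fallback image. A small sketch that looks for those in raw HTML could be written as follows; the function name and the regular expressions are assumptions, not code from this repository.

```python
import re

# Loader URL, e.g. https://connect.facebook.net/en_US/fbevents.js
FBEVENTS_RE = re.compile(r"connect\.facebook\.net/[^/'\"]+/fbevents\.js")
# fbq('init', '<pixel id>') call inside the inline script.
PIXEL_INIT_RE = re.compile(r"fbq\(\s*['\"]init['\"]\s*,\s*['\"](\d+)['\"]")
# <noscript> fallback image: https://www.facebook.com/tr?id=<pixel id>&...
PIXEL_NOSCRIPT_RE = re.compile(r"facebook\.com/tr\?id=(\d+)")


def detect_meta_pixel(html: str) -> dict:
    """Report whether the loader is referenced and which pixel IDs appear."""
    ids = set(PIXEL_INIT_RE.findall(html)) | set(PIXEL_NOSCRIPT_RE.findall(html))
    return {
        "loader_found": bool(FBEVENTS_RE.search(html)),
        "pixel_ids": sorted(ids),
    }
```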

IDEAS

Turn it into a browser extension that notifies you.

Checking website origin:

https://ipinfo.io/
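
A sketch of how the origin check might look, assuming ipinfo.io's JSON endpoint at `https://ipinfo.io/<ip>/json` and its `country` field. The helper names are made up for illustration. Requests without a token may be rate-limited, and the result describes where the server is hosted, which is not necessarily where the site's operator is based (think CDNs).

```python
import json
import socket
from urllib.parse import urlparse
from urllib.request import urlopen


def site_host(site_url: str):
    """Extract the hostname from a full URL, or None if there is none."""
    return urlparse(site_url).hostname


def ipinfo_url(ip: str) -> str:
    """Build the ipinfo.io JSON lookup URL for an IP address."""
    return f"https://ipinfo.io/{ip}/json"


def lookup_site_origin(site_url: str, timeout: float = 10.0) -> dict:
    """Resolve the site's host to an IPv4 address and query ipinfo.io for it."""
    ip = socket.gethostbyname(site_host(site_url))
    with urlopen(ipinfo_url(ip), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `lookup_site_origin("https://example.de/").get("country")` would then give a two-letter country code for the server, which a crawl could store next to each hit.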