Kusto Tor Exit Nodes Historic Table

Description


This table is a historical dataset of Tor exit node IP addresses and the dates on which each was listed.

Source


Taken from the publicly available resource on the Tor Project's own site: https://check.torproject.org/torbulkexitlist

Why should I use this data?


It’s a high-fidelity dataset because it comes from the Tor Project itself, which publishes it to identify Tor usage. It is useful for finding Tor exit nodes within existing datasets, and it represents a snapshot of the Tor environment on a specified date. Collection for this dataset began on 1 May 2024.
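
As a minimal sketch of the "snapshot on a specified date" idea, the query below keeps only the rows that were active on one date. The sample rows are hypothetical (the IPs come from reserved documentation ranges, not real exit nodes), and the names TargetDate and SampleTorExitNodes are illustrative assumptions. In practice the datatable would be swapped for the externaldata table described on this page.

```kusto
// Illustrative date to look up, in the same 'yyyy-MM-dd' form used by ActiveDates.
let TargetDate = '2024-05-03';
// Hypothetical sample rows (reserved documentation IPs, not real exit nodes).
let SampleTorExitNodes = datatable (IP:string, ActiveDates:string, Source:string) [
    '192.0.2.10', '2024-05-01,2024-05-02,2024-05-03', 'torproject.org',
    '198.51.100.7', '2024-05-02', 'torproject.org'
];
SampleTorExitNodes
// Split the comma-separated string and keep rows whose dates include TargetDate.
| where set_has_element(split(ActiveDates, ','), TargetDate)
| project IP, Source
```

With these sample rows, only the first IP lists the chosen date. Splitting before comparing gives an exact match on whole date values rather than a substring or term search.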

Disclaimer

Please note that, depending on how this dataset evolves (e.g. dataset creation processing, size, and Kusto ingestion processing), it may be changed to include only a specific time window (e.g. the last 365 days of data).

Updates


Daily at around 03:00 UTC, although the source data is updated more frequently.

Schema


Column Name    Data Type    Notes
IP             string       The given IP address
ActiveDates    string       Comma-separated date values (‘yyyy-MM-dd’)
Source         string       Always torproject.org
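
Because ActiveDates is a single comma-separated string, most analysis starts by expanding it into one row per IP per date. The following is a small sketch of that step, assuming the schema above. The sample rows and the ActiveDate column name are illustrative, and the IPs are reserved documentation addresses rather than real exit nodes.

```kusto
// Hypothetical sample rows in the documented schema.
datatable (IP:string, ActiveDates:string, Source:string) [
    '192.0.2.10', '2024-05-01,2024-05-02,2024-05-03', 'torproject.org',
    '198.51.100.7', '2024-05-02', 'torproject.org'
]
// Turn the comma-separated string into a dynamic array of date strings.
| extend ActiveDates = split(ActiveDates, ',')
// Emit one row per array element, typed as string.
| mv-expand ActiveDate = ActiveDates to typeof(string)
// Convert each 'yyyy-MM-dd' string to a datetime for filtering and charting.
| extend ActiveDate = todatetime(ActiveDate)
| project IP, ActiveDate, Source
```

Once the dates are datetime values, they can be filtered with ordinary range comparisons or summarized by day.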

Base Kusto Table


externaldata (IP:string, ActiveDates:string, Source:string) ['https://firewalliplists.gypthecat.com/lists/kusto/kusto-tor-exit-historic.json.zip'] with (format="multijson")

Base Kusto Function


let TorExitNodes = externaldata (IP:string, ActiveDates:string, Source:string) ['https://firewalliplists.gypthecat.com/lists/kusto/kusto-tor-exit-historic.json.zip'] with (format="multijson");
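
One way the function might be used is to check whether IPs in an existing log table were listed as Tor exit nodes on the day each event occurred. The sketch below assumes a hypothetical SampleLogs table with Timestamp and RemoteIP columns. Neither the table nor its columns exist on this page, and the sample IPs are reserved documentation addresses, so no matches are expected until they are replaced with real log data.

```kusto
let TorExitNodes = externaldata (IP:string, ActiveDates:string, Source:string) ['https://firewalliplists.gypthecat.com/lists/kusto/kusto-tor-exit-historic.json.zip'] with (format="multijson");
// Hypothetical log table standing in for real telemetry.
let SampleLogs = datatable (Timestamp:datetime, RemoteIP:string) [
    datetime(2024-05-02 10:15:00), '192.0.2.10',
    datetime(2024-05-03 08:00:00), '203.0.113.5'
];
SampleLogs
// Reduce each event time to its calendar day so it can be compared with ActiveDates.
| extend EventDate = startofday(Timestamp)
| join kind=inner (
    TorExitNodes
    | extend ActiveDates = split(ActiveDates, ',')
    | mv-expand ActiveDate = ActiveDates to typeof(string)
    | extend ActiveDate = todatetime(ActiveDate)
    | project IP, ActiveDate
  ) on $left.RemoteIP == $right.IP, $left.EventDate == $right.ActiveDate
| project Timestamp, RemoteIP, ActiveDate
```

Matching on both the IP and the day avoids flagging an address that was an exit node at some other time but not when the event happened.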

Self Contained Kusto


// Starting from 1 May 2024, on how many distinct days has a Tor exit node been available in each country?
let TorExitNodesHistoric = externaldata(IP:string, ActiveDates:string, Source:string) ['https://firewalliplists.gypthecat.com/lists/kusto/kusto-tor-exit-historic.json.zip'] with(format="multijson");
TorExitNodesHistoric
| extend ActiveDates = split(ActiveDates, ',')
| extend Country = tostring(geo_info_from_ip_address(IP)['country'])
// Expand to one row per IP per date so that individual dates, not whole arrays, are counted
| mv-expand ActiveDate = ActiveDates to typeof(string)
| summarize ActiveDays = array_length(make_set(ActiveDate)) by Country
| order by ActiveDays desc

// How many Tor exit nodes which were online at the beginning of the data set are online today?
let TorExitNodesHistoric = externaldata(IP:string, ActiveDates:string, Source:string) ['https://firewalliplists.gypthecat.com/lists/kusto/kusto-tor-exit-historic.json.zip'] with(format="multijson");
let EarliestDate = toscalar(TorExitNodesHistoric
| extend ActiveDates = split(ActiveDates, ',')
| mv-expand ActiveDate = ActiveDates to typeof(string)
| extend ActiveDate = todatetime(ActiveDate)
| summarize tostring(format_datetime(min(ActiveDate), 'yyyy-MM-dd')));
TorExitNodesHistoric
| where ActiveDates has (EarliestDate) and ActiveDates has tostring(format_datetime(now(), 'yyyy-MM-dd'))
| count

// How many exit nodes per day have there been?
let TorExitNodesHistoric = externaldata(IP:string, ActiveDates:string, Source:string) ['https://firewalliplists.gypthecat.com/lists/kusto/kusto-tor-exit-historic.json.zip'] with(format="multijson");
TorExitNodesHistoric
| extend ActiveDates = split(ActiveDates, ',')
| mv-expand ActiveDate = ActiveDates to typeof(string)
| extend ActiveDate = todatetime(ActiveDate)
| summarize count() by ActiveDate
| render timechart

// How many days have exit nodes been online for?
let TorExitNodesHistoric = externaldata(IP:string, ActiveDates:string, Source:string) ['https://firewalliplists.gypthecat.com/lists/kusto/kusto-tor-exit-historic.json.zip'] with(format="multijson");
TorExitNodesHistoric
| extend ActiveDates = split(ActiveDates, ',')
| summarize count() by DaysActive = tostring(array_length(ActiveDates))
| render piechart

MDE Example


Coming soon.

Sentinel & Azure Log Analytics Example


Coming soon.