KB 2803: RETRIEVE NUMBER OF USERS PASSING THROUGH OLFEO PROXY

Published November 28, 2023

Obtain information on your proxy's usage, such as the number of users over the last 30 days and the names of these users.

BACKGROUND

The Statistics section of Olfeo cannot directly report the number of users who passed through the Olfeo proxy over a given period: results there are limited to 300 entities for performance reasons.

However, this information can be obtained by querying the Elasticsearch database that stores Olfeo's browsing logs.

STEPS

To obtain this information, connect via SSH to the Olfeo proxy (the master node if you have a cluster).

Once connected, if you are on a virtual machine, enter the chrooted environment with the command:

chroot /opt/olfeo/chroot

You can then send a query to Elasticsearch that returns the number of unique users recorded in the database over the last 30 days:

curl -d '{"size":0,"aggs":{"primary":{"terms":{"field":"user","size":1000000},"aggs":{"count":{"sum":{"field":"hits"}}}}},"query":{"bool":{"must":[{"range":{"timerange":{"lte":"now/d","gte":"now/d-30d"}}}]}}}' -H "Content-Type: application/json" -X POST "http://127.0.0.1:9200/url_*/_search" | python3 -c 'import json,sys;print("Number of users:", len(json.load(sys.stdin)["aggregations"]["primary"]["buckets"]))'
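For readers who prefer a script to a one-liner, the same count can be sketched in Python using only the standard library. This is a minimal sketch, not an Olfeo-provided tool: the function names (build_count_query, count_users, fetch) are hypothetical, while the URL, the field names (user, hits, timerange) and the bucket size are copied from the curl command above.

```python
import json
import urllib.request

# URL copied from the curl command above.
ES_URL = "http://127.0.0.1:9200/url_*/_search"


def build_count_query(days=30, field="user", size=1000000):
    """Build the same request body as the curl command (hypothetical helper)."""
    return {
        "size": 0,  # return no individual hits, only the aggregation
        "aggs": {
            "primary": {
                "terms": {"field": field, "size": size},
                "aggs": {"count": {"sum": {"field": "hits"}}},
            }
        },
        "query": {"bool": {"must": [{"range": {"timerange": {
            "lte": "now/d", "gte": f"now/d-{days}d"}}}]}},
    }


def count_users(response):
    """Each bucket of the terms aggregation corresponds to one distinct user."""
    return len(response["aggregations"]["primary"]["buckets"])


def fetch(body, url=ES_URL):
    """POST the query to Elasticsearch and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

Run on the proxy, count_users(fetch(build_count_query())) should give the same figure as the one-liner; passing days=7 would narrow the window to one week.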

Get user list

To list the users by name, run the following command (the size parameter of the aggregation caps the output at 5000 names):

curl -H "Content-Type: application/json" -d '{"aggs":{"names": {"terms": {"field": "name.keyword", "size":"5000"}}}, "query":{"bool":{"must":[{"range":{"timerange":{"lte":"now/d","gte":"now/d-30d"}}}]}}}' 'http://127.0.0.1:9200/url_*/_search?pretty=true&size=0'  | python3 -c 'import json,sys;[print(e["key"]) for e in json.load(sys.stdin)["aggregations"]["names"]["buckets"]]'
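The Python part of this pipeline only reads the bucket keys of the names aggregation. The sketch below shows that step on a hypothetical reply and adds a check on sum_other_doc_count, the field Elasticsearch includes in a terms aggregation result to report documents that fell outside the returned buckets. A non-zero value suggests the size of 5000 was too small. The helper names and the sample data are assumptions made for illustration.

```python
def user_names(response, agg="names"):
    """Return the bucket keys (user names) of a terms aggregation reply."""
    return [bucket["key"] for bucket in response["aggregations"][agg]["buckets"]]


def is_truncated(response, agg="names"):
    """True when some documents fell outside the returned buckets."""
    return response["aggregations"][agg].get("sum_other_doc_count", 0) > 0


# Hypothetical reply, shaped like an Elasticsearch terms aggregation result.
sample = {
    "aggregations": {
        "names": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 12,
            "buckets": [
                {"key": "alice", "doc_count": 40},
                {"key": "bob", "doc_count": 7},
            ],
        }
    }
}

print(user_names(sample))    # ['alice', 'bob']
print(is_truncated(sample))  # True
```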

Redirect user list to a file

To save this list of users to a file, redirect the output:

curl -H "Content-Type: application/json" -d '{"aggs":{"names": {"terms": {"field": "name.keyword", "size":"5000"}}}, "query":{"bool":{"must":[{"range":{"timerange":{"lte":"now/d","gte":"now/d-30d"}}}]}}}' 'http://127.0.0.1:9200/url_*/_search?pretty=true&size=0'  | python3 -c 'import json,sys;[print(e["key"]) for e in json.load(sys.stdin)["aggregations"]["names"]["buckets"]]' > /tmp/liste_utilisateurs.txt
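If the names are already held in a Python list, the redirection can be replaced by a small helper. This is a sketch under stated assumptions: save_names is a hypothetical function, the default path is the one used in the command above, and sorting the names alphabetically is a choice made here that the shell redirection does not make.

```python
from pathlib import Path


def save_names(names, path="/tmp/liste_utilisateurs.txt"):
    """Write one user name per line, sorted, and return how many were written."""
    ordered = sorted(names)
    Path(path).write_text(
        "".join(f"{name}\n" for name in ordered), encoding="utf-8"
    )
    return len(ordered)
```

The return value gives a quick cross-check against the user count obtained earlier.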

Inspect user list interactively

Finally, to browse this list interactively, pipe the output to less:

curl -H "Content-Type: application/json" -d '{"aggs":{"names": {"terms": {"field": "name.keyword", "size":"5000"}}}, "query":{"bool":{"must":[{"range":{"timerange":{"lte":"now/d","gte":"now/d-30d"}}}]}}}' 'http://127.0.0.1:9200/url_*/_search?pretty=true&size=0'  | python3 -c 'import json,sys;[print(e["key"]) for e in json.load(sys.stdin)["aggregations"]["names"]["buckets"]]' | less

VALIDATION

You can then obtain more detailed information (domains visited, browsing volume) on a given user via the Olfeo Webadmin statistics engine.