NewsStats (c) 2010-2013, 2025 Thomas Hochstein <thh@thh.name>

NewsStats is a software package for gathering statistical data live
from a Usenet feed and analysing it afterwards.

This package is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation.

---------------------------------------------------------------------

What's that?

There's a multitude of tools to create statistics about newsgroup
usage: number of postings per month or per person, longest threads,
and so on (see <https://th-h.de/net/usenet/stats/> [German language]
for an incomplete list). Most of them use a per-newsgroup approach
while NewsStats is hierarchy-oriented.

NewsStats will accumulate data from a live INN feed, allowing you
to process the saved information later on.

Workflow

NewsStats saves overview data and complete headers of (all)
incoming postings to a (MySQL) database in real time.

That raw data will be regularly - e.g. monthly - processed into a
second set of database tables, each dedicated to a certain
statistical aspect, e.g. number of postings per group and month.

Several kinds of reports can then be generated from those result
tables.
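
To illustrate the second step, the following sketch counts postings
per group and month from a few made-up raw records. It is not
NewsStats code and does not reflect the actual database schema; the
record layout (date, then newsgroup), the sample data and the
function names are assumptions chosen for illustration only.

```shell
#!/bin/sh
# Illustration only: aggregate raw "date newsgroup" records into
# postings per group and month. Layout and data are made up.

count_per_group_and_month() {
    # Input:  lines of "YYYY-MM-DD newsgroup" on stdin.
    # Output: lines of "YYYY-MM newsgroup count", sorted.
    awk '{ count[substr($1, 1, 7) " " $2]++ }
         END { for (key in count) print key, count[key] }' | LC_ALL=C sort
}

sample_raw_data() {
    # Hypothetical raw records, one posting per line.
    cat <<'EOF'
2025-05-03 de.test
2025-05-17 de.test
2025-05-20 de.comp.misc
2025-06-01 de.test
EOF
}

sample_raw_data | count_per_group_and_month
```

For the sample records this prints one line per group and month:

    2025-05 de.comp.misc 1
    2025-05 de.test 2
    2025-06 de.test 1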

Prerequisites

NewsStats is written in Perl (5.8.x and above) and makes use of a
MySQL database, so you will need Perl, some modules, MySQL and, of
course, INN.

* Perl 5.8.x or above with standard modules
  - Cwd
  - Encode
  - File::Basename
  - Getopt::Long
  - Sys::Syslog

* Perl modules from CPAN
  - Config::Auto
  - Date::Format
  - DBI

* MySQL 5.0.x

* a working installation of INN
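
A quick way to see which of the Perl modules listed above are
available is to try loading each of them. The following is a minimal
sketch, assuming that 'perl' is in your PATH; the function name is
made up for this example, and the check covers neither MySQL nor INN.

```shell
#!/bin/sh
# Sketch: report which of the Perl modules listed above can be loaded.

check_perl_modules() {
    for module in Cwd Encode File::Basename Getopt::Long Sys::Syslog \
                  Config::Auto Date::Format DBI; do
        # "perl -MModule -e 1" exits non-zero if the module
        # (or perl itself) cannot be found.
        if perl "-M$module" -e 1 2>/dev/null; then
            echo "ok       $module"
        else
            echo "MISSING  $module"
        fi
    done
}

check_perl_modules
```

Any module reported as MISSING has to be installed, e.g. from CPAN
or from your distribution's packages, before NewsStats will run.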

Installation instructions

See INSTALL.

Documentation is in /doc, configuration in /etc, the NewsStats
module in /lib and most scripts in /bin (all relative to the
package's base directory), while /contrib has some sample scripts
that may have to be adjusted to work in your configuration.

Getting Started

'feedlog.pl' will continuously feed raw data to your raw data
table. See the feedlog.pl man page for more information.

You can process that data via 'gatherstats.pl'; currently the
tabulation of postings per group, injection server and posting
agent (newsreader) per month is supported. See the gatherstats.pl
man page for more information.

Example:

    bin/gatherstats.pl

will parse raw data from the last month and save the results in
tables for postings per group, server and client, respectively.
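
Since gatherstats.pl works on the data of the last month, it lends
itself to being run once a month, e.g. from cron. The following
wrapper is only a sketch: the install location /srv/newsstats, the
script name, the DRY_RUN switch and the schedule are assumptions,
not part of NewsStats.

```shell
#!/bin/sh
# Sketch of a wrapper script for a monthly cron job.
# Assumptions (not from the NewsStats docs): the install location
# /srv/newsstats, the script name and the schedule below.
#
# Possible crontab entry (03:30 on the 1st of each month):
#   30 3 1 * * DRY_RUN=0 /usr/local/bin/newsstats-monthly.sh

NEWSSTATS_HOME="${NEWSSTATS_HOME:-/srv/newsstats}"

run_gatherstats() {
    # By default only print the command; set DRY_RUN=0 to execute it.
    if [ "${DRY_RUN:-1}" = "0" ]; then
        cd "$NEWSSTATS_HOME" && bin/gatherstats.pl
    else
        echo "cd $NEWSSTATS_HOME && bin/gatherstats.pl"
    fi
}

run_gatherstats
```

Run without DRY_RUN=0, the script only shows what it would do, which
makes it safe to try out before adding it to your crontab.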

Report generation is handled by specialised scripts for each
report type: 'groupstats.pl' for postings per group,
'hoststats.pl' for postings per injection server and
'clientstats.pl' for postings per posting agent. See the
groupstats.pl, hoststats.pl and clientstats.pl man pages for more
information.

Example:

    bin/groupstats.pl -o postings-desc
    bin/hoststats.pl -o postings-desc
    bin/clientstats.pl -o postings-desc -v

will show reports for postings per group, per injection server and
per client (with detailed client versions) for the last month,
using the result tables filled by gatherstats.pl.

To post those reports to Usenet, change postingstats.pl according
to your needs (sender, newsgroups and other headers, translation
of table headers and text templates) and display a test posting
by piping report data into postingstats.pl:

    bin/groupstats.pl --nocomments -s -f dump | bin/postingstats.pl

If the result is to your liking, add a pipe to an inews
implementation.

Example:

    bin/groupstats.pl --nocomments -s -f dump | bin/postingstats.pl | contrib/tinews.pl -X
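
The two steps above (preview first, then post) can be kept together
in a small helper. This is a sketch only: the function name and its
'preview' and 'post' modes are made up for this example, and it uses
nothing but the commands and options already shown above. It prints
the pipeline instead of running it, so nothing is posted by accident.

```shell
#!/bin/sh
# Sketch: print the pipeline for previewing or posting a group report.
# Meant to be used from the NewsStats base directory.

report_pipeline() {
    # $1 is "preview" (display a test posting) or "post" (hand the
    # result to the inews implementation from /contrib).
    pipeline='bin/groupstats.pl --nocomments -s -f dump | bin/postingstats.pl'
    case "$1" in
        preview) echo "$pipeline" ;;
        post)    echo "$pipeline | contrib/tinews.pl -X" ;;
        *)       echo "usage: report_pipeline preview|post" >&2
                 return 2 ;;
    esac
}

# Show the preview command; to run it, use:
#   eval "$(report_pipeline preview)"
report_pipeline preview
```

Only switch from 'preview' to 'post' once the test posting looks the
way you want it.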

More information

See the man pages for 'gatherstats' and the report generating
scripts.

Reporting Bugs

You can report bugs or feature requests to the author using the
issue tracker at <https://code.virtcomm.de/thh/newsstats/issues>.

Please have a look at the TODO list before suggesting
improvements.

Development

This package is maintained using the Git version control system at
<https://code.virtcomm.de/thh/newsstats/>.

Related projects

<http://usenet.dex.de/> is a site where data gathered via NewsStats
is used for a graphical presentation of activity in the de.*
Usenet hierarchy over the years (since 1992).

Author

Thomas Hochstein <thh@thh.name>
<https://th-h.de/>