Add documentation.

Add doc/
Add doc/README
Add doc/INSTALL

Signed-off-by: Thomas Hochstein <thh@inter.net>
Thomas Hochstein 2010-09-17 21:16:51 +02:00
parent 13c5a175ef
commit 610b5ef492
2 changed files with 222 additions and 0 deletions

doc/INSTALL Normal file

@@ -0,0 +1,128 @@
NewsStats 0.1 (c) 2010 Thomas Hochstein <thh@inter.net>
NewsStats is a software package for gathering statistical data live
from a Usenet feed and subsequent examination.
This script package is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as published by
the Free Software Foundation.
---------------------------------------------------------------------
INSTALLATION INSTRUCTIONS
1) Install the scripts
* Download the current version of NewsStats from
<http://th-h.de/download/scripts.php>.
* Untar it into a directory of your choice:
# tar -xzf newsstats-nn.tar.gz
Scripts in this path should be executable by the news user.
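A quick way to spot scripts that lack the execute bit is sketched
below. It assumes the scripts carry a .pl suffix and sit directly in
the directory you pass in; list_non_executable is a hypothetical
helper, not part of NewsStats. The -x test applies to the calling
user, so run it as the news user.

```shell
# Hypothetical helper: print every *.pl file in a directory that the
# current user cannot execute. Assumes the scripts use a .pl suffix.
list_non_executable() {
    dir=$1
    for script in "$dir"/*.pl; do
        [ -e "$script" ] || continue    # the glob matched nothing
        [ -x "$script" ] || printf '%s\n' "$script"
    done
}
```

An empty result means every script found can be executed by the
calling user.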
2) Configuration
* Copy the sample configuration file newsstats.conf.sample to
newsstats.conf and modify it for your purposes:
# cp newsstats.conf.sample newsstats.conf
# vim newsstats.conf
a) Mandatory configuration options
* DBDriver = mysql
Database driver used; currently only mysql is supported.
* DBHost = localhost
The host your mysql server is running on.
* DBUser =
The username to connect to the database server.
* DBPw =
Matching password for your username.
* DBDatabase = newsstats
Database name.
NewsStats will use those credentials to connect to your mysql
installation.
* DBTableRaw = raw_de
Table holding raw statistical data.
* DBTableGrps = groups_de
Table holding data on postings per group.
b) Optional configuration options
* TLH = de
Limit examination to that top-level hierarchy.
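Before moving on, it may help to check that no mandatory option was
left empty. The sketch below assumes newsstats.conf consists of
"Key = value" lines as shown above; missing_options is a hypothetical
helper, not part of NewsStats.

```shell
# Hypothetical helper: print each mandatory option that is absent from
# the given file or has an empty value. Assumes "Key = value" lines.
missing_options() {
    conf=$1
    for key in DBDriver DBHost DBUser DBPw DBDatabase DBTableRaw DBTableGrps; do
        # An option counts as set when a non-blank value follows the "=".
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*[^[:space:]]" "$conf"; then
            printf '%s\n' "$key"
        fi
    done
}
```

Run against an unmodified copy of the sample file, this would list
DBUser and DBPw if they are shipped empty as in the listing above.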
3) Database (mysql) setup
* Set up your database server with a username, password and
database matching the NewsStats configuration (see 2 a).
* Start the installation script:
# install/install.pl
It will set up the necessary database tables and display some
information on the next steps.
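If the database and account do not exist yet, statements along the
following lines could create them. This is a sketch under
assumptions: print_db_setup_sql is a hypothetical helper, the
password is a placeholder you must replace, and which privileges
NewsStats actually needs is not stated here, so granting all
privileges on the one database is a guess.

```shell
# Hypothetical helper: print SQL that creates the database and the
# account named in newsstats.conf. 'CHANGE_ME' is a placeholder, and
# GRANT ALL on the database is an assumption.
print_db_setup_sql() {
    db=$1
    user=$2
    host=$3
    printf 'CREATE DATABASE IF NOT EXISTS %s;\n' "$db"
    printf "CREATE USER '%s'@'%s' IDENTIFIED BY 'CHANGE_ME';\n" "$user" "$host"
    printf "GRANT ALL PRIVILEGES ON %s.* TO '%s'@'%s';\n" "$db" "$user" "$host"
}

# Review the output first, then feed it to the mysql client, e.g.:
#   print_db_setup_sql newsstats stats localhost | mysql -u root -p
```

Printing the statements instead of executing them keeps the step
reviewable; the account name "stats" in the usage line is only an
example.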
4) Feed (INN) setup
You have to set up an INN feed to feedlog.pl.
* Edit your 'newsfeeds' file and insert something like
## gather statistics for NewsStats
newsstats!\
:!*,de.*\
:Tc,WmtfbsPNH,Ac:/path/to/feedlog.pl
* You should feed only the hierarchy (or hierarchies) to feedlog.pl
that you want to cover with your statistical examination. It may
be a good idea to set up different feeds (to different
databases ...) for different hierarchies.
* Please double-check that your path to feedlog.pl is correct and
that feedlog.pl can be executed by the news user.
* Check your 'newsfeeds' syntax:
# ctlinnd checkfile
* Reload 'newsfeeds':
# ctlinnd reload newsfeeds 'Adding newsstats! feed'
* Watch your 'news.notice' and 'errlog' files:
# tail -f /var/log/news/news.notice
...
# tail -f /var/log/news/errlog
Everything should be going smoothly now.
* If INN is spewing error messages to 'errlog' or reporting
continuous respawns of feedlog.pl to 'news.notice', stop your feed:
# ctlinnd drop 'newsstats!'
and investigate. 'errlog' may be helpful here.
* You can restart the feed with
# ctlinnd begin 'newsstats!'
later.
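To double-check the path configured in 'newsfeeds', the program
field of the entry can be extracted and tested. The sketch below
assumes the entry is written with backslash line continuations, as
the newsfeeds format expects for multi-line entries; feed_program is
a hypothetical helper, not part of INN or NewsStats.

```shell
# Hypothetical helper: print the program configured for one site entry
# in a newsfeeds file. Usage: feed_program SITE FILE
feed_program() {
    site=$1
    file=$2
    awk -v site="$site" '
        /^[ \t]*#/ { next }                  # skip comment lines
        {
            sub(/^[ \t]+/, "")               # drop leading indentation
            entry = entry $0
            if (sub(/\\$/, "", entry)) next  # trailing backslash: entry continues
            n = split(entry, field, ":")
            if (n >= 4 && field[1] == site) {
                split(field[n], word, " ")   # first word is the program path
                print word[1]
            }
            entry = ""
        }' "$file"
}

# Example (run as the news user; the newsfeeds location is site-specific):
#   prog=$(feed_program 'newsstats!' /path/to/newsfeeds)
#   [ -x "$prog" ] || echo "cannot execute: $prog"
```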
You should be done now.
Just have a look at your raw data (DBTableRaw). It should now start to
fill up.
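One way to watch the raw table fill up is to count its rows from
time to time. The sketch below reads the table name from
newsstats.conf (assuming "Key = value" lines as in step 2) and prints
the query; conf_value and raw_count_sql are hypothetical helpers.

```shell
# Hypothetical helper: print the value of one "Key = value" option.
conf_value() {
    key=$1
    conf=$2
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$conf" |
        sed 's/[[:space:]]*$//' | head -n 1
}

# Hypothetical helper: print a row-count query for the DBTableRaw table.
raw_count_sql() {
    table=$(conf_value DBTableRaw "$1")
    [ -n "$table" ] || return 1     # option missing or empty
    printf 'SELECT COUNT(*) FROM %s;\n' "$table"
}

# Example, with USER and DATABASE taken from DBUser and DBDatabase:
#   raw_count_sql newsstats.conf | mysql -u USER -p DATABASE
```

A count that grows between two runs indicates that feedlog.pl is
writing to the table.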

doc/README Normal file

@@ -0,0 +1,94 @@
NewsStats 0.1 (c) 2010 Thomas Hochstein <thh@inter.net>
NewsStats is a software package for gathering statistical data live
from a Usenet feed and subsequent examination.
This script package is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as published by
the Free Software Foundation.
---------------------------------------------------------------------
What's that?
There's a multitude of tools for the statistical examination of
newsgroups: number of postings per month or per person, longest
threads, and so on (see <http://th-h.de/infos/usenet/stats.php>
[German language] for an incomplete list). Most of them use a per-
newsgroup approach while NewsStats is hierarchy oriented.
NewsStats will accumulate data from a live INN feed, allowing you
to process the saved information later on.
Workflow
NewsStats saves overview data and complete headers of (all)
incoming postings to a (MySQL) database in real time.
That raw data will be regularly - e.g. monthly - processed to a
second set of database tables each dedicated to a certain
statistical aspect, e.g. number of postings per group per month.
Several kinds of reports can then be generated from those result
tables.
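To illustrate what "number of postings per group per month" means,
the following sketch tabulates a small text listing. It is not how
NewsStats works internally (the processing happens in the database);
the input format "YYYY-MM-DD newsgroup" and the function name are
assumptions made for the example.

```shell
# Hypothetical illustration: count postings per month and group from
# lines of the form "YYYY-MM-DD newsgroup" read on standard input.
postings_per_group_per_month() {
    awk '{ count[substr($1, 1, 7) " " $2]++ }
         END { for (key in count) print key, count[key] }' | LC_ALL=C sort
}
```

Each output line holds the month, the group and the number of
postings counted for that pair.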
Prerequisites
NewsStats is written in Perl (5.8.x and above) and makes use of a
MySQL database, so you'll need Perl, some modules, mysql and, of
course, an INN.
* Perl 5.8.x with standard modules
- Cwd
- File::Basename
- Sys::Syslog
* Perl modules from CPAN
- Config::Auto
- Date::Format
- DBI
* mysql 5.0.x
* working installation of INN
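Whether the required Perl modules are available can be checked by
asking perl to load each one. A minimal sketch, assuming perl is in
your PATH; missing_perl_modules is a hypothetical helper, not part of
NewsStats.

```shell
# Hypothetical helper: print each named Perl module that fails to load.
missing_perl_modules() {
    for mod in "$@"; do
        perl -M"$mod" -e 1 2>/dev/null || printf '%s\n' "$mod"
    done
}

# Example with the modules listed above:
#   missing_perl_modules Cwd File::Basename Sys::Syslog \
#       Config::Auto Date::Format DBI
```

An empty result means all named modules could be loaded.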
Installation instructions
See INSTALL.
Getting Started
'feedlog.pl' will continuously feed raw data to your raw data
table. See the feedlog.pl man page for more information.
You can process that data via 'gatherstats.pl'; currently only the
tabulation of postings per group per month is supported. More to
come. See the gatherstats.pl man page for more information.
Report generation is handled by specialised scripts for each
report type. Currently only reports on the number of postings per
group per month are supported; you can use 'groupstats.pl' for
that. See the groupstats.pl man page for more information.
Reporting Bugs
You can report bugs or feature requests to the author by using the
bug tracker at <http://bugs.th-h.de/>.
More Information
This program is maintained using the Git version control system.
You may clone <git://code.th-h.de/usenet/newsstats.git> to check
out the current development tree or browse it on the web via
<http://code.th-h.de/?p=usenet/newsstats.git>.
Related projects
<http://usenet.dex.de/> is a site where data gathered via NewsStats
is used for a graphical presentation of activity in the de.*
Usenet hierarchy over the years.
Author
Thomas Hochstein <thh@inter.net>
<http://th-h.de/>