Compare commits

22 commits
master ... next

Author SHA1 Message Date
Thomas Hochstein d49ef1d302 Update POD documentation (URLs) and author's address.
Signed-off-by: Thomas Hochstein <thh@thh.name>
2024-05-27 01:04:03 +02:00
Thomas Hochstein 09b45fc369 Update doc/README
Signed-off-by: Thomas Hochstein <thh@thh.name>
2024-05-27 00:55:55 +02:00
Thomas Hochstein ee29be18c8 Update TODO from bugs.th-h.de
Signed-off-by: Thomas Hochstein <thh@thh.name>
2024-05-27 00:15:34 +02:00
Thomas Hochstein cd6f153a9e Add README.
Signed-off-by: Thomas Hochstein <thh@thh.name>
2024-05-26 00:08:34 +02:00
Thomas Hochstein b5ef572664 Accept an upper/lower boundary of 0 (zero).
The code checks if a boundary is set by looking
for a TRUE value, but 0 is FALSE. It has to check
whether the variable is set, i.e. defined(),
instead.

Fixes #56.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2018-01-01 21:37:43 +01:00
Thomas Hochstein 91c674c4fe Merge branch 'thh-small-changes' into next
* thh-small-changes:
  Fix displayed path in install.
  Update INSTALL documentation.
  Fix documentation relating to conffile location.
  Fix --conffile in POD.
  Bump version numbers.
  Fix forgotten dates.
  Fix ea91003a99.
2018-01-01 16:58:25 +01:00
Thomas Hochstein fd0717a15c Fix displayed path in install.
install.pl will display a sample newsfeeds entry.
Adapt the path to the changes in
2ad99c20bc.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-04 11:45:33 +02:00
Thomas Hochstein b3b170c357 Update INSTALL documentation.
Configuration files now reside in etc/.

This was an oversight from commit
2ad99c20bc.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-04 11:45:31 +02:00
Thomas Hochstein 44c197097b Fix documentation relating to conffile location.
CONFIGURATION section talks about newsstats.conf being
in the same directory which is not true any more since
2ad99c20bc.

It also didn't mention the --conffile option which was
added in 23ab67a099.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-04 11:43:13 +02:00
Thomas Hochstein e39d4207a6 Fix --conffile in POD.
Change '--conffile' to 'B<--conffile>'.
The wrong format was added to documentation
in commit
23ab67a099.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-04 11:30:28 +02:00
Thomas Hochstein 24d2011f32 Bump version numbers.
All scripts - and the package - have been
restructured in commit
2ad99c20bc,
but version numbers didn't change accordingly.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-04 11:29:50 +02:00
Thomas Hochstein 2871792120 Fix forgotten dates.
Some dates were not bumped when releasing v 0.01
in 07c0b2589a.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-04 11:29:21 +02:00
Thomas Hochstein 22d3d70a72 Fix ea91003a99.
Commit ea91003a99
was broken and did not check for undefined
variables.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-04 10:53:34 +02:00
Thomas Hochstein 599fefbf6a Merge branch 'thh-bug51' into next
* thh-bug51:
  One more default sorting order ("grouping").
2013-09-03 22:25:23 +02:00
Thomas Hochstein 7624accb6e Merge branch 'thh-small-changes' into next
* thh-small-changes:
  Small comment fixes.
  --sums is not compatible with --checkgroups.
2013-09-03 22:25:13 +02:00
Thomas Hochstein 8dc6823e98 Small comment fixes.
Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-03 17:12:09 +02:00
Thomas Hochstein 17ef44085f --sums is not compatible with --checkgroups.
'Virtual' .ALL groups will never be present in
a checkgroups file, and we can't use them anyway
as they would contain postings from groups that
are filtered out by --checkgroups.

Add a warning, put a note in the documentation.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-03 15:10:07 +02:00
Thomas Hochstein ea91003a99 One more default sorting order ("grouping").
If --group-by is not set, output will be grouped
by month by default (as long as --boundary is
not set to 'level' or 'average', where grouping
by newsgroup is default).

Now we default to 'newsgroup' if just one newsgroup
is requested by --newsgroups, but more than one
month by --month.

Both defaults can be overridden.

But forced --group-by=month for --report type
'average' or 'sum' in front so defaults are
not checked.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-03 14:56:17 +02:00
Thomas Hochstein 1af57a5390 Merge branch 'thh-restructure' into next
* thh-restructure:
  Make configuration file configurable.
  Fix some whitespace.
  Redo directory structure.
2013-09-03 14:55:42 +02:00
Thomas Hochstein 23ab67a099 Make configuration file configurable.
Add --conffile option to all scripts to
override standard config file location
etc/newsstats.conf.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-03 10:01:20 +02:00
Thomas Hochstein dfc2b81c37 Fix some whitespace.
Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-03 10:01:18 +02:00
Thomas Hochstein 2ad99c20bc Redo directory structure.
* Move all scripts to /bin
* Move configuration to /etc
* Move NewsStats.pm to /lib
* Add new path to NewsStats.pm to all scripts
* Set $HomePath to top level directory
* Move setting of config file name to ReadConf()

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-03 10:01:16 +02:00
11 changed files with 191 additions and 122 deletions

2
.gitignore vendored

@ -1,3 +1,3 @@
tmp/
tmp/*
newsstats.conf
etc/newsstats.conf

15
README.md Normal file

@ -0,0 +1,15 @@
# NewsStats
**NewsStats** is a software package to extract live data from an INN newsfeed and generate statistics from it.
## Description
**NewsStats** stores overview data and complete headers of all incoming postings (in one or more specific Usenet hierarchies) in real time in a MySQL database. This raw dataset can then be analysed regularly, e.g. monthly, for instance in terms of postings per group and month. The analysis results will also be stored in a database which in turn can be used to generate various reports.
The software package is still under development.
It is currently used to generate the monthly statistics posted to `de.admin.news.lists` for the de.\* hierarchy.
## More information
Please see the [distribution page](https://th-h.de/net/software/newsstats/) (in German).

bin/feedlog.pl

@ -4,18 +4,19 @@
#
# This script will log headers and other data to a database
# for further analysis by parsing a feed from INN.
#
#
# It is part of the NewsStats package.
#
# Copyright (c) 2010-2013 Thomas Hochstein <thh@inter.net>
# Copyright (c) 2010-2013 Thomas Hochstein <thh@thh.name>
#
# It can be redistributed and/or modified under the same terms under
# It can be redistributed and/or modified under the same terms under
# which Perl itself is published.
BEGIN {
our $VERSION = "0.01";
our $VERSION = "0.02";
use File::Basename;
push(@INC, dirname($0));
# we're in .../bin, so our module is in ../lib
push(@INC, dirname($0).'/../lib');
}
use strict;
use warnings;
@ -68,14 +69,15 @@ sub PrepareDB {
################################# Main program #################################
### read commandline options
my ($OptDebug,$OptQuiet);
my ($OptDebug,$OptQuiet,$OptConfFile);
GetOptions ('d|debug!' => \$OptDebug,
'q|test!' => \$OptQuiet,
'conffile=s' => \$OptConfFile,
'h|help' => \&ShowPOD,
'V|version' => \&ShowVersion) or exit 1;
### read configuration
my %Conf = %{ReadConfig($HomePath.'/newsstats.conf')};
my %Conf = %{ReadConfig($OptConfFile)};
### init syslog
openlog($0, 'nofatal,pid', LOG_NEWS);
@ -129,7 +131,7 @@ while (<>) {
};
};
$DBQuery->finish;
warn sprintf("-----\nDay: %s\nDate: %s\nMID: %s\nTS: %s\nToken: %s\n".
"Size: %s\nPeer: %s\nPath: %s\nNewsgroups: %s\nHeaders: %s\n",
$Day, $Date, $Mid, $Timestamp, $Token, $Size, $Peer, $Path,
@ -151,7 +153,7 @@ feedlog - log data from an INN feed to a database
=head1 SYNOPSIS
B<feedlog> [B<-Vhdq>]
B<feedlog> [B<-Vhdq>] [B<--conffile> I<filename>]
=head1 REQUIREMENTS
@ -172,7 +174,8 @@ terminating would only result in a rapid respawn.
=head2 Configuration
B<feedlog> will read its configuration from F<newsstats.conf> which
should be present in the same directory via Config::Auto.
should be present in etc/ via Config::Auto or from a configuration file
submitted by the B<--conffile> option.
See L<doc/INSTALL> for an overview of possible configuration options.
@ -197,6 +200,10 @@ find that information most probably in your B<INN> F<errlog> file.
Suppress logging to syslog.
=item B<--conffile> I<filename>
Load configuration from I<filename> instead of F<newsstats.conf>.
=back
=head1 INSTALLATION
@ -218,15 +225,15 @@ See L<doc/INSTALL> for further information.
=over 4
=item F<feedlog.pl>
=item F<bin/feedlog.pl>
The script itself.
=item F<NewsStats.pm>
=item F<lib/NewsStats.pm>
Library functions for the NewsStats package.
=item F<newsstats.conf>
=item F<etc/newsstats.conf>
Runtime configuration file.
@ -235,7 +242,7 @@ Runtime configuration file.
=head1 BUGS
Please report any bugs or feature requests to the author or use the
bug tracker at L<http://bugs.th-h.de/>!
bug tracker at L<https://code.virtcomm.de/thh/newsstats/issues>!
=head1 SEE ALSO
@ -255,11 +262,11 @@ This script is part of the B<NewsStats> package.
=head1 AUTHOR
Thomas Hochstein <thh@inter.net>
Thomas Hochstein <thh@thh.name>
=head1 COPYRIGHT AND LICENSE
Copyright (c) 2010-2012 Thomas Hochstein <thh@inter.net>
Copyright (c) 2010-2013 Thomas Hochstein <thh@thh.name>
This program is free software; you may redistribute it and/or modify it
under the same terms as Perl itself.
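
The `--conffile` handling introduced in this diff can be illustrated with a minimal sketch. It assumes a hypothetical helper `resolve_conffile` (not part of NewsStats) and an example installation path `/opt/newsstats`; the real scripts pass `$OptConfFile` to `ReadConfig()`, which applies the default itself.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper, for illustration only: returns the configuration
# file a script would read for a given top-level directory and argument list.
sub resolve_conffile {
    my ($home_path, @args) = @_;
    my $opt_conffile;
    GetOptionsFromArray(\@args, 'conffile=s' => \$opt_conffile)
        or die "Invalid command line options\n";
    # Same fallback rule as in the diff: use etc/newsstats.conf below the
    # top-level directory when no --conffile was given.
    return $opt_conffile ? $opt_conffile : "$home_path/etc/newsstats.conf";
}

# '/opt/newsstats' is an assumed example path.
print resolve_conffile('/opt/newsstats'), "\n";
print resolve_conffile('/opt/newsstats', '--conffile', '/tmp/test.conf'), "\n";
```

Keeping the default inside one function means every script gets the same fallback without repeating the path.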

bin/gatherstats.pl

@ -4,18 +4,19 @@
#
# This script will gather statistical information from a database
# containing headers and other information from a INN feed.
#
#
# It is part of the NewsStats package.
#
# Copyright (c) 2010-2013 Thomas Hochstein <thh@inter.net>
# Copyright (c) 2010-2013 Thomas Hochstein <thh@thh.name>
#
# It can be redistributed and/or modified under the same terms under
# It can be redistributed and/or modified under the same terms under
# which Perl itself is published.
BEGIN {
our $VERSION = "0.01";
our $VERSION = "0.02";
use File::Basename;
push(@INC, dirname($0));
# we're in .../bin, so our module is in ../lib
push(@INC, dirname($0).'/../lib');
}
use strict;
use warnings;
@ -37,7 +38,7 @@ my %LegalStats;
### read commandline options
my ($OptCheckgroupsFile,$OptClientsDB,$OptDebug,$OptGroupsDB,$OptTLH,
$OptHostsDB,$OptMonth,$OptRawDB,$OptStatsType,$OptTest);
$OptHostsDB,$OptMonth,$OptRawDB,$OptStatsType,$OptTest,$OptConfFile);
GetOptions ('c|checkgroups=s' => \$OptCheckgroupsFile,
'clientsdb=s' => \$OptClientsDB,
'd|debug!' => \$OptDebug,
@ -48,11 +49,12 @@ GetOptions ('c|checkgroups=s' => \$OptCheckgroupsFile,
'rawdb=s' => \$OptRawDB,
's|stats=s' => \$OptStatsType,
't|test!' => \$OptTest,
'conffile=s' => \$OptConfFile,
'h|help' => \&ShowPOD,
'V|version' => \&ShowVersion) or exit 1;
### read configuration
my %Conf = %{ReadConfig($HomePath.'/newsstats.conf')};
my %Conf = %{ReadConfig($OptConfFile)};
### override configuration via commandline options
my %ConfOverride;
@ -71,6 +73,8 @@ $OptStatsType = 'all' if !$OptStatsType;
### get time period from --month
# get verbal description of time period, drop SQL code
my ($Period) = &GetTimePeriod($OptMonth);
# bail out if --month is invalid or set to 'ALL';
# we don't support the latter
&Bleat(2,"--month option has an invalid format - please use 'YYYY-MM' or ".
"'YYYY-MM:YYYY-MM'!") if (!$Period or $Period eq 'all time');
@ -160,7 +164,7 @@ foreach my $Month (&ListMonth($Period)) {
}
};
};
# delete old data for that month
if (!$OptTest) {
$DBQuery = $DBHandle->do(sprintf("DELETE FROM %s.%s WHERE month = ?",
@ -206,7 +210,7 @@ gatherstats - process statistical data from a raw source
=head1 SYNOPSIS
B<gatherstats> [B<-Vhdt>] [B<-m> I<YYYY-MM> | I<YYYY-MM:YYYY-MM>] [B<-s> I<stats>] [B<-c> I<filename template>]] [B<--hierarchy> I<TLH>] [B<--rawdb> I<database table>] [B<-groupsdb> I<database table>] [B<--clientsdb> I<database table>] [B<--hostsdb> I<database table>]
B<gatherstats> [B<-Vhdt>] [B<-m> I<YYYY-MM> | I<YYYY-MM:YYYY-MM>] [B<-s> I<stats>] [B<-c> I<filename template>]] [B<--hierarchy> I<TLH>] [B<--rawdb> I<database table>] [B<-groupsdb> I<database table>] [B<--clientsdb> I<database table>] [B<--hostsdb> I<database table>] [B<--conffile> I<filename>]
=head1 REQUIREMENTS
@ -257,7 +261,8 @@ override that default through the B<--groupsdb> option.
=head2 Configuration
B<gatherstats> will read its configuration from F<newsstats.conf>
which should be present in the same directory via Config::Auto.
which should be present in etc/ via Config::Auto or from a configuration file
submitted by the B<--conffile> option.
See L<doc/INSTALL> for an overview of possible configuration options.
@ -291,7 +296,7 @@ conjunction with B<--test> ... everything else seems a bit pointless.
Set processing period to a single month in YYYY-MM format or to a time
period between two month in YYYY-MM:YYYY-MM format (two month, separated
by a colon).
by a colon).
=item B<-s>, B<--stats> I<type>
@ -339,6 +344,10 @@ Override I<DBTableClnts> from F<newsstats.conf>.
Override I<DBTableHosts> from F<newsstats.conf>.
=item B<--conffile> I<filename>
Load configuration from I<filename> instead of F<newsstats.conf>.
=back
=head1 INSTALLATION
@ -368,15 +377,15 @@ checking against checkgroups-*:
=over 4
=item F<gatherstats.pl>
=item F<bin/gatherstats.pl>
The script itself.
=item F<NewsStats.pm>
=item F<lib/NewsStats.pm>
Library functions for the NewsStats package.
=item F<newsstats.conf>
=item F<etc/newsstats.conf>
Runtime configuration file.
@ -385,7 +394,7 @@ Runtime configuration file.
=head1 BUGS
Please report any bugs or feature requests to the author or use the
bug tracker at L<http://bugs.th-h.de/>!
bug tracker at L<https://code.virtcomm.de/thh/newsstats/issues>!
=head1 SEE ALSO
@ -405,11 +414,11 @@ This script is part of the B<NewsStats> package.
=head1 AUTHOR
Thomas Hochstein <thh@inter.net>
Thomas Hochstein <thh@thh.name>
=head1 COPYRIGHT AND LICENSE
Copyright (c) 2010-2012 Thomas Hochstein <thh@inter.net>
Copyright (c) 2010-2013 Thomas Hochstein <thh@thh.name>
This program is free software; you may redistribute it and/or modify it
under the same terms as Perl itself.

bin/groupstats.pl

@ -4,18 +4,19 @@
#
# This script will get statistical data on newgroup usage
# from a database.
#
#
# It is part of the NewsStats package.
#
# Copyright (c) 2010-2013 Thomas Hochstein <thh@inter.net>
# Copyright (c) 2010-2013 Thomas Hochstein <thh@thh.name>
#
# It can be redistributed and/or modified under the same terms under
# It can be redistributed and/or modified under the same terms under
# which Perl itself is published.
BEGIN {
our $VERSION = "0.01";
our $VERSION = "0.02";
use File::Basename;
push(@INC, dirname($0));
# we're in .../bin, so our module is in ../lib
push(@INC, dirname($0).'/../lib');
}
use strict;
use warnings;
@ -31,7 +32,7 @@ Getopt::Long::config ('bundling');
### read commandline options
my ($OptBoundType,$OptCaptions,$OptCheckgroupsFile,$OptComments,
$OptFileTemplate,$OptFormat,$OptGroupBy,$OptGroupsDB,$LowBound,$OptMonth,
$OptNewsgroups,$OptOrderBy,$OptReportType,$OptSums,$UppBound);
$OptNewsgroups,$OptOrderBy,$OptReportType,$OptSums,$UppBound,$OptConfFile);
GetOptions ('b|boundary=s' => \$OptBoundType,
'c|captions!' => \$OptCaptions,
'checkgroups=s' => \$OptCheckgroupsFile,
@ -47,6 +48,7 @@ GetOptions ('b|boundary=s' => \$OptBoundType,
'r|report=s' => \$OptReportType,
's|sums!' => \$OptSums,
'u|upper=i' => \$UppBound,
'conffile=s' => \$OptConfFile,
'h|help' => \&ShowPOD,
'V|version' => \&ShowVersion) or exit 1;
# parse parameters
@ -76,12 +78,19 @@ if ($OptReportType) {
$OptReportType = 'default';
}
}
# read list of newsgroups from --checkgroups
# into a hash reference
my $ValidGroups = &ReadGroupList($OptCheckgroupsFile) if $OptCheckgroupsFile;
# honor $OptCheckgroupsFile,
# warn for $OptSums if set concurrently
my $ValidGroups;
if ($OptCheckgroupsFile) {
# read list of newsgroups from --checkgroups
# into a hash reference
$ValidGroups = &ReadGroupList($OptCheckgroupsFile);
&Bleat(1,"--sums option can't possibly work with --checkgroups option set")
if $OptSums;
}
### read configuration
my %Conf = %{ReadConfig($HomePath.'/newsstats.conf')};
my %Conf = %{ReadConfig($OptConfFile)};
### override configuration via commandline options
my %ConfOverride;
@ -124,12 +133,17 @@ if ($OptBoundType and $OptBoundType ne 'default') {
}
### get sort order and build SQL 'ORDER BY' clause
# force to 'month' for $OptReportType 'average' or 'sum'
$OptGroupBy = 'month' if ($OptReportType and $OptReportType ne 'default');
# default to 'newsgroup' for $OptBoundType 'level' or 'average'
$OptGroupBy = 'newsgroup' if (!$OptGroupBy and
$OptBoundType and $OptBoundType ne 'default');
# force to 'month' for $OptReportType 'average' or 'sum'
$OptGroupBy = 'month' if ($OptReportType and $OptReportType ne 'default');
# default to 'newsgroup' if $OptGroupBy is not set and
# just one newsgroup is requested, but more than one month
$OptGroupBy = 'newsgroup' if (!$OptGroupBy and $OptMonth and $OptMonth =~ /:/
and $OptNewsgroups and $OptNewsgroups !~ /[:*%]/);
# parse $OptGroupBy to $GroupBy, create ORDER BY clause $SQLOrderClause
# if $OptGroupBy is still not set, SQLSortOrder() will default to 'month'
my ($GroupBy,$SQLOrderClause) = SQLSortOrder($OptGroupBy, $OptOrderBy);
# $GroupBy will contain 'month' or 'newsgroup' (parsed result of $OptGroupBy)
# set it to 'month' or 'key' for OutputData()
@ -244,7 +258,7 @@ if ($OptCaptions && $OptComments) {
($OptOrderBy and $OptOrderBy =~ /posting/i) ? 'by number of postings ' : '',
($OptOrderBy and $OptOrderBy =~ /-?desc$/i) ? 'descending' : 'ascending');
}
# output data
&OutputData($OptFormat,$OptComments,$GroupBy,$Precision,
$OptCheckgroupsFile ? $ValidGroups : '',
@ -263,7 +277,7 @@ groupstats - create reports on newsgroup usage
=head1 SYNOPSIS
B<groupstats> [B<-Vhcs> B<--comments>] [B<-m> I<YYYY-MM>[:I<YYYY-MM>] | I<all>] [B<-n> I<newsgroup(s)>] [B<--checkgroups> I<checkgroups file>] [B<-r> I<report type>] [B<-l> I<lower boundary>] [B<-u> I<upper boundary>] [B<-b> I<boundary type>] [B<-g> I<group by>] [B<-o> I<order by>] [B<-f> I<output format>] [B<--filetemplate> I<filename template>] [B<--groupsdb> I<database table>]
B<groupstats> [B<-Vhcs> B<--comments>] [B<-m> I<YYYY-MM>[:I<YYYY-MM>] | I<all>] [B<-n> I<newsgroup(s)>] [B<--checkgroups> I<checkgroups file>] [B<-r> I<report type>] [B<-l> I<lower boundary>] [B<-u> I<upper boundary>] [B<-b> I<boundary type>] [B<-g> I<group by>] [B<-o> I<order by>] [B<-f> I<output format>] [B<--filetemplate> I<filename template>] [B<--groupsdb> I<database table>] [B<--conffile> I<filename>]
=head1 REQUIREMENTS
@ -328,7 +342,8 @@ Captions and comments are automatically disabled in this case.
=head2 Configuration
B<groupstats> will read its configuration from F<newsstats.conf>
which should be present in the same directory via Config::Auto.
which should be present in etc/ via Config::Auto or from a configuration file
submitted by the B<--conffile> option.
See doc/INSTALL for an overview of possible configuration options.
@ -346,7 +361,7 @@ Print out version and copyright information and exit.
Print this man page and exit.
=item B<-m>, B<--month> I<YYYY-MM[:YYYY-MM]|all>
=item B<-m>, B<--month> I<YYYY-MM[:YYYY-MM]|all>
Set processing period to a single month in YYYY-MM format or to a time
period between two month in YYYY-MM:YYYY-MM format (two month, separated
@ -373,6 +388,9 @@ example:
See the B<gatherstats> man page for details.
This option does not work together with the B<--checkgroups> option as
all "virtual" groups will not be present in the checkgroups file.
=item B<--checkgroups> I<filename>
Restrict output to those newgroups present in a file in checkgroups format
@ -382,6 +400,9 @@ line is ignored). All other newsgroups will be removed from output.
Contrary to B<gatherstats>, I<filename> is not a template, but refers to
a single file in checkgroups format.
The B<--sums> option will not work together with this option as "virtual"
groups will not be present in the checkgroups file.
=item B<-r>, B<--report> I<default|average|sums>
Choose the report type: I<default>, I<average> or I<sums>
@ -592,6 +613,10 @@ B<--nocomments> is enforced, see above.
Override I<DBTableGrps> from F<newsstats.conf>.
=item B<--conffile> I<filename>
Load configuration from I<filename> instead of F<newsstats.conf>.
=back
=head1 INSTALLATION
@ -635,15 +660,15 @@ machine-readable form (without formatting):
=over 4
=item F<groupstats.pl>
=item F<bin/groupstats.pl>
The script itself.
=item F<NewsStats.pm>
=item F<lib/NewsStats.pm>
Library functions for the NewsStats package.
=item F<newsstats.conf>
=item F<etc/newsstats.conf>
Runtime configuration file.
@ -652,7 +677,7 @@ Runtime configuration file.
=head1 BUGS
Please report any bugs or feature requests to the author or use the
bug tracker at L<http://bugs.th-h.de/>!
bug tracker at L<https://code.virtcomm.de/thh/newsstats/issues>!
=head1 SEE ALSO
@ -676,11 +701,11 @@ This script is part of the B<NewsStats> package.
=head1 AUTHOR
Thomas Hochstein <thh@inter.net>
Thomas Hochstein <thh@thh.name>
=head1 COPYRIGHT AND LICENSE
Copyright (c) 2010-2012 Thomas Hochstein <thh@inter.net>
Copyright (c) 2010-2013 Thomas Hochstein <thh@thh.name>
This program is free software; you may redistribute it and/or modify it
under the same terms as Perl itself.
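
The order in which the `--group-by` defaults are applied can be sketched as a single function. This is a simplified illustration, assuming a hypothetical `default_group_by` helper with named arguments; the final fallback to `month` stands in for what the comments in the diff attribute to `SQLSortOrder()`.

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: mirrors the order of the
# three assignments to $OptGroupBy shown in the diff.
sub default_group_by {
    my (%opt) = @_;
    my $group_by = $opt{group_by};
    # 1. default to 'newsgroup' for boundary type 'level' or 'average'
    $group_by = 'newsgroup'
        if !$group_by and $opt{bound_type} and $opt{bound_type} ne 'default';
    # 2. force 'month' for report type 'average' or 'sum'
    $group_by = 'month'
        if $opt{report_type} and $opt{report_type} ne 'default';
    # 3. default to 'newsgroup' for one newsgroup over more than one month
    $group_by = 'newsgroup'
        if !$group_by and $opt{month} and $opt{month} =~ /:/
           and $opt{newsgroups} and $opt{newsgroups} !~ /[:*%]/;
    # 4. assumed stand-in for the SQLSortOrder() default
    return $group_by || 'month';
}

print default_group_by(month => '2013-01:2013-06', newsgroups => 'de.test'), "\n"; # newsgroup
print default_group_by(month => '2013-01', newsgroups => 'de.test'), "\n";         # month
print default_group_by(month => '2013-01:2013-06', newsgroups => 'de.*'), "\n";    # month
```

Because step 2 runs before step 3 and sets a value, a report type of `average` or `sum` always wins over the single-newsgroup default.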

doc/INSTALL

@ -28,21 +28,21 @@ INSTALLATION INSTRUCTIONS
* Copy the sample configuration file newsstats.conf.sample to
newsstats.conf and modify it for your purposes:
# cp newsstats.conf.sample newsstats.conf
# vim newsstats.conf
# cp etc/newsstats.conf.sample etc/newsstats.conf
# vim etc/newsstats.conf
a) Mandatory configuration options
* DBDriver = mysql
Database driver used; currently only mysql is supported.
* DBHost = localhost
The host your mysql server is running on.
* DBUser =
The username to connect to the database server.
* DBPw =
* DBPw =
Matching password for your username.
* DBDatabase = newsstats
@ -61,17 +61,17 @@ INSTALLATION INSTRUCTIONS
* TLH = de
Limit examination to that top-level hierarchy.
3) Database (mysql) setup
* Setup your database server with a username, password and
database matching the NewsStats configuration (see 2 a).
* Start the installation script:
# install/install.pl
It will setup the necessary database tables and display some
It will setup the necessary database tables and display some
information on the next steps.
4) Feed (INN) setup

doc/README

@ -73,17 +73,15 @@ Getting Started
Reporting Bugs
You can report bugs or feature requests to the author using the
bug tracker at <http://bugs.th-h.de/>.
issue tracker at <https://code.virtcomm.de/thh/newsstats/issues>.
Please have a look at the TODO list before suggesting
improvements.
More Information
Development
This program is maintained using the Git version control system.
You may clone <git://code.th-h.de/usenet/newsstats.git> to check
out the current development tree or browse it on the web via
<http://code.th-h.de/?p=usenet/newsstats.git>.
This program is maintained using the Git version control system at
<https://code.virtcomm.de/thh/newsstats/>.
Related projects
@ -93,6 +91,6 @@ Related projects
Author
Thomas Hochstein <thh@inter.net>
<http://th-h.de/>
Thomas Hochstein <thh@thh.name>
<https://th-h.de/>

View file

@ -4,8 +4,6 @@
This is a list of planned bug fixes, improvements and enhancements for
NewsStats.
Bug numbers refer to the Mantis issue tracker at <http://bugs.th-h.de/>.
* General
- Improve Documentation
The documentation is rather sparse and could use some improvement.
@ -18,7 +16,7 @@ Bug numbers refer to the Mantis issue tracker at <http://bugs.th-h.de/>.
to /usr/local/news/etc or /etc/news and so on
* Additional features
- Add hierarchy information (GroupInfo - Bugs #19 #20 #21 #22 #23 #24 #25 #26)
- Add hierarchy information (GroupInfo)
NewsStats should be able to recognize invalid (i.e. officially not existing)
newsgroups and - optionally - drop them from the list of groups. On the
other hand, it should recognize existing, but empty groups and add them with
@ -37,6 +35,19 @@ Bug numbers refer to the Mantis issue tracker at <http://bugs.th-h.de/>.
NewsStats should offer tools e.g. to inject postings into the 'raw' database,
or to split databases.
* GroupInfo project
- Create a hierarchy information database, containing information on each
newsgroup, its creation and removal time, its tagline, charter and
moderation status, including the moderator contact address.
- Automatically update hierarchy information, e.g. by parsing control messages
(with verification!).
- Track changes in meta information (changes to tagline, charter, moderation
status etc.)
- Add tools to query for hierarchy information:
- canonical list of newsgroups for any given time
- generate list of changes for a time period
- find newsgroups (including wildcards) and display their history
* Individual improvements
+ NewsStats.pm
- Improve error handling when reading config

install/install.pl

@ -3,27 +3,25 @@
# install.pl
#
# This script will create database tables as necessary.
#
#
# It is part of the NewsStats package.
#
# Copyright (c) 2010-2013 Thomas Hochstein <thh@inter.net>
# Copyright (c) 2010-2013 Thomas Hochstein <thh@thh.name>
#
# It can be redistributed and/or modified under the same terms under
# It can be redistributed and/or modified under the same terms under
# which Perl itself is published.
BEGIN {
our $VERSION = "0.01";
our $VERSION = "0.02";
use File::Basename;
# we're in .../install, so our module is in ..
push(@INC, dirname($0).'/..');
# we're in .../install, so our module is in ../lib
push(@INC, dirname($0).'/../lib');
}
use strict;
use warnings;
use NewsStats qw(:DEFAULT);
use Cwd;
use DBI;
use Getopt::Long qw(GetOptions);
Getopt::Long::config ('bundling');
@ -31,18 +29,15 @@ Getopt::Long::config ('bundling');
################################# Main program #################################
### read commandline options
my ($OptUpdate);
my ($OptUpdate,$OptConfFile);
GetOptions ('u|update=s' => \$OptUpdate,
'conffile=s' => \$OptConfFile,
'h|help' => \&ShowPOD,
'V|version' => \&ShowVersion) or exit 1;
### change working directory to .. (as we're in .../install)
chdir dirname($FullPath).'/..';
my $Path = cwd();
### read configuration
print("Reading configuration.\n");
my %Conf = %{ReadConfig($Path.'/newsstats.conf')};
my %Conf = %{ReadConfig($OptConfFile)};
##### --------------------------------------------------------------------------
##### Database table definitions
@ -53,9 +48,9 @@ CREATE DATABASE IF NOT EXISTS `$Conf{'DBDatabase'}` DEFAULT CHARSET=utf8;
SQLDB
my %DBCreate = ('DBTableRaw' => <<RAW, 'DBTableGrps' => <<GRPS);
--
--
-- Table structure for table DBTableRaw
--
--
CREATE TABLE IF NOT EXISTS `$Conf{'DBTableRaw'}` (
`id` bigint(20) unsigned NOT NULL auto_increment,
@ -76,9 +71,9 @@ CREATE TABLE IF NOT EXISTS `$Conf{'DBTableRaw'}` (
KEY `peer` (`peer`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='Raw data';
RAW
--
--
-- Table structure for table DBTableGrps
--
--
CREATE TABLE IF NOT EXISTS `$Conf{'DBTableGrps'}` (
`id` bigint(20) unsigned NOT NULL auto_increment,
@ -108,7 +103,7 @@ Things left to do:
## gather statistics for NewsStats
newsstats!\\
:!*,de.*\\
:Tc,WmtfbsPNH,Ac:$Path/feedlog.pl
:Tc,WmtfbsPNH,Ac:$HomePath/bin/feedlog.pl
Please
@ -136,7 +131,7 @@ Things left to do:
Enjoy!
-thh <thh\@inter.net>
-thh <thh\@thh.name>
INSTALL
my $Upgrade ='';
@ -167,7 +162,7 @@ if (!$OptUpdate) {
my $DBQuery = $DBHandle->prepare($DBCreate);
$DBQuery->execute() or &Bleat(2, sprintf("Can't create database %s: %s%\n",
$Conf{'DBDatabase'}, $DBI::errstr));
printf("Database table %s created succesfully.\n",$Conf{'DBDatabase'});
$DBHandle->disconnect;
};
@ -185,7 +180,7 @@ if (!$OptUpdate) {
&CreateTable($Table);
};
print "Database table generation done.\n";
# Display install instructions
print $Install;
} else {
@ -255,7 +250,7 @@ install - installation script
=head1 SYNOPSIS
B<install> [B<-Vh> [--update I<version>]
B<install> [B<-Vh> [--update I<version>] [B<--conffile> I<filename>]
=head1 REQUIREMENTS
@ -267,8 +262,9 @@ This script will create database tables as necessary and configured.
=head2 Configuration
B<install> will read its configuration from F<newsstats.conf> via
Config::Auto.
B<install> will read its configuration from F<newsstats.conf> which should
be present in etc/ via Config::Auto or from a configuration file submitted
by the B<--conffile> option.
See L<doc/INSTALL> for an overview of possible configuration options.
@ -288,21 +284,25 @@ Print this man page and exit.
Don't do a fresh install, but update from I<version>.
=item B<--conffile> I<filename>
Load configuration from I<filename> instead of F<newsstats.conf>.
=back
=head1 FILES
=over 4
=item F<install.pl>
=item F<install/install.pl>
The script itself.
=item F<NewsStats.pm>
=item F<lib/NewsStats.pm>
Library functions for the NewsStats package.
=item F<newsstats.conf>
=item F<etc/newsstats.conf>
Runtime configuration file.
@ -311,7 +311,7 @@ Runtime configuration file.
=head1 BUGS
Please report any bugs or feature requests to the author or use the
bug tracker at L<http://bugs.th-h.de/>!
bug tracker at L<https://code.virtcomm.de/thh/newsstats/issues>!
=head1 SEE ALSO
@ -331,11 +331,11 @@ This script is part of the B<NewsStats> package.
=head1 AUTHOR
Thomas Hochstein <thh@inter.net>
Thomas Hochstein <thh@thh.name>
=head1 COPYRIGHT AND LICENSE
Copyright (c) 2010-2012 Thomas Hochstein <thh@inter.net>
Copyright (c) 2010-2013 Thomas Hochstein <thh@thh.name>
This program is free software; you may redistribute it and/or modify it
under the same terms as Perl itself.

lib/NewsStats.pm

@ -4,7 +4,7 @@
#
# Copyright (c) 2010-2013 Thomas Hochstein <thh@inter.net>
#
# This module can be redistributed and/or modified under the same terms under
# This module can be redistributed and/or modified under the same terms under
# which Perl itself is published.
package NewsStats;
@ -49,20 +49,24 @@ require Exporter;
Output => [qw(OutputData FormatOutput)],
SQLHelper => [qw(SQLHierarchies SQLSortOrder SQLGroupList
SQLSetBounds SQLBuildClause GetMaxLength)]);
$VERSION = '0.01';
our $PackageVersion = '0.01';
$VERSION = '0.02';
our $PackageVersion = '0.02';
use Data::Dumper;
use File::Basename;
use Cwd qw(realpath);
use Config::Auto;
use DBI;
#####-------------------------------- Vars --------------------------------#####
# trim the path
# save $0 in $FullPath
our $FullPath = $0;
our $HomePath = dirname($0);
# strip filename and /bin or /install directory to create the $HomePath
our $HomePath = dirname(realpath($0));
$HomePath =~ s/\/(bin|install)//;
# trim $0
$0 =~ s%.*/%%;
# set version string
our $MyVersion = "$0 $::VERSION (NewsStats.pm $VERSION)";
@ -76,7 +80,7 @@ sub ShowVersion {
################################################################################
### display version and exit
print "NewsStats v$PackageVersion\n$MyVersion\n";
print "Copyright (c) 2010-2012 Thomas Hochstein <thh\@inter.net>\n";
print "Copyright (c) 2010-2013 Thomas Hochstein <thh\@inter.net>\n";
print "This program is free software; you may redistribute it ".
"and/or modify it under the same terms as Perl itself.\n";
exit(100);
@ -99,6 +103,8 @@ sub ReadConfig {
### IN : $ConfFile: config filename
### OUT: reference to a hash containing the configuration
my ($ConfFile) = @_;
# set default
$ConfFile = $HomePath . '/etc/newsstats.conf' if !$ConfFile;
# mandatory configuration options
my @Mandatory = ('DBDriver','DBHost','DBUser','DBPw','DBDatabase',
'DBTableRaw','DBTableGrps');
@ -238,7 +244,7 @@ sub ReadGroupList {
### ignoring everything after the first whitespace and so accepting files
### in checkgroups format as well as (parts of) an INN active file)
### IN : $Filename : file to read
### OUT: \%ValidGroups: hash containing all valid newsgroups
### OUT: \%ValidGroups: reference to a hash containing all valid newsgroups
my ($Filename) = @_;
my %ValidGroups;
open (my $LIST,"<$Filename") or &Bleat(2,"Cannot read $Filename: $!");
@ -269,12 +275,12 @@ sub GetTimePeriod {
my ($Verbal, $SQL);
# define a regular expression for a month
my $REMonth = '\d{4}-\d{2}';
# default to last month if option is not set
if(!$Month) {
$Month = &LastMonth;
}
# check for valid input
if ($Month =~ /^$REMonth$/) {
# single month (YYYY-MM)
@ -293,7 +299,7 @@ sub GetTimePeriod {
# invalid input
return (undef,undef);
}
return ($Verbal,$SQL);
};
@ -401,7 +407,7 @@ sub OutputData {
my %ValidKeys = %{$ValidKeys} if $ValidKeys;
my ($FileName, $Handle, $OUT);
our $LastIteration;
# define output types
my %LegalOutput;
@LegalOutput{('dump','list','pretty')} = ();
@ -433,7 +439,7 @@ sub OutputData {
# safeguards for filename creation:
# replace potential problem characters with '_'
$FileName = sprintf('%s-%s',$FileTempl,$Caption);
$FileName =~ s/[^a-zA-Z0-9_-]+/_/g;
$FileName =~ s/[^a-zA-Z0-9_-]+/_/g;
open ($OUT,">$FileName")
or &Bleat(2,sprintf("Cannot open output file '%s': $!",
$FileName));
@ -668,7 +674,7 @@ sub SQLSetBounds {
### OUT: SQL code to become part of a WHERE or HAVING clause
my ($Type,$LowBound,$UppBound) = @_;
($LowBound,$UppBound) = SQLCheckNumber($LowBound,$UppBound);
if($LowBound and $UppBound and $LowBound > $UppBound) {
if($LowBound and defined($UppBound) and $LowBound > $UppBound) {
&Bleat(1,"Lower boundary $LowBound is larger than Upper boundary ".
"$UppBound, exchanging boundaries.");
($LowBound,$UppBound) = ($UppBound,$LowBound);
@ -684,7 +690,7 @@ sub SQLSetBounds {
} elsif ($Type eq 'sum') {
$WhereHavingFunction = 'SUM(postings)'
}
$LowBound = sprintf('%s >= '.$LowBound,$WhereHavingFunction) if ($LowBound);
$LowBound = sprintf('%s >= '.$LowBound,$WhereHavingFunction) if defined($LowBound);
# set $LowBound to SQL statement:
# 'WHERE postings <=', 'HAVING MAX(postings) <=' or 'HAVING AVG(postings) <='
if ($Type eq 'level') {
@ -694,7 +700,7 @@ sub SQLSetBounds {
} elsif ($Type eq 'sum') {
$WhereHavingFunction = 'SUM(postings)'
}
$UppBound = sprintf('%s <= '.$UppBound,$WhereHavingFunction) if ($UppBound);
$UppBound = sprintf('%s <= '.$UppBound,$WhereHavingFunction) if defined($UppBound);
return ($LowBound,$UppBound);
};
@ -770,5 +776,3 @@ sub CheckValidNewsgroups {
#####------------------------------- done ---------------------------------#####
1;
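
The boundary fix in `SQLSetBounds` rests on the difference between a true value and a defined value in Perl. A minimal sketch, assuming a hypothetical helper `lower_bound_clause` that is not part of NewsStats.pm:

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: builds a lower-bound
# condition, or returns nothing when no boundary was given.
sub lower_bound_clause {
    my ($column, $low_bound) = @_;
    # Test with defined(), not for truth: 0 is false but a valid boundary.
    return unless defined $low_bound;
    return sprintf('%s >= %d', $column, $low_bound);
}

my $with_zero = lower_bound_clause('postings', 0);
my $unset     = lower_bound_clause('postings', undef);

print "$with_zero\n";                                  # postings >= 0
print defined($unset) ? "set\n" : "not set\n";         # not set
```

A plain truth test (`if ($low_bound)`) would treat a boundary of 0 like a missing one, which is the behaviour the commit describes as broken.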