Commit graph

89 commits

Author SHA1 Message Date
Thomas Hochstein aef5467bfe Handle more than one entity in From: etc.
From:, Sender: etc. may contain more than one
entity in a comma separated list, i.e. a From:
line like
"From: Me <me@example.com>, You <you@example.com>"
is perfectly valid.

Handle multiple entities when splitting those
headers and save all names and all addresses
as (new) comma separated lists in the
corresponding database fields.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-08 17:53:21 +02:00
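A minimal sketch of the splitting described in this commit. The log only says that Mail::Address was added as a prerequisite in an earlier commit; this illustration uses Python's standard library instead, and the function name split_entities is hypothetical.

```python
from email.utils import getaddresses


def split_entities(header_value):
    """Split an address-list header into two comma-separated strings.

    Returns (names, addresses), each joined with ", " in header order.
    """
    pairs = getaddresses([header_value])  # list of (name, address) tuples
    names = ", ".join(name for name, _ in pairs)
    addresses = ", ".join(addr for _, addr in pairs)
    return names, addresses


# The example header from the commit message, without the "From: " prefix.
names, addresses = split_entities("Me <me@example.com>, You <you@example.com>")
```

Both strings keep the order of the header, so the n-th name still belongs to the n-th address.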
Thomas Hochstein ca8ac4d50f Let gatherstats read its data from DBTableParse.
Switch gatherstats.pl over to the parsed database.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-08 17:53:19 +02:00
Thomas Hochstein 9630376c31 Add decoding and parsing of From: etc.
Decode From:, Sender:, Reply-To:, Subject:;
parse From:, Sender:, Reply-To:.

Add Mail::Address to prerequisites.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-08 17:53:17 +02:00
Thomas Hochstein 6d72dad2c0 Create a database table with parsed raw data.
Incoming data is written to DBTableRaw without
much interpretation. To allow for more and
better analysis that raw data should be parsed
daily and copied to another database table
with separate fields for most header lines.
All other scripts could use that pre-parsed
data.

* Add database schema to install.pl
* Add DBTableParse to newsstats.conf.sample
  and as mandatory to NewsStats.pm
* Add parsedb.pl

TODO:
- Documentation is only rudimentary.
- From:, Sender:, Reply-To: and Subject:
  are not yet parsed.
- gatherstats.pl does not yet use DBTableParse.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-08 17:27:50 +02:00
Thomas Hochstein 3634010808 Make GetTimePeriod() and others accept days.
GetTimePeriod() was written to take a month
('YYYY-MM') and work with that. Make it accept
not only a month, but also a day ('YYYY-MM-DD')
by adding a $Type modifier.

Rename LastMonth() to LastMonthDay() and rewrite
it accordingly.

Rename CheckMonth() to CheckPeriod() and rewrite
it accordingly.

As GetTimePeriod() defaults to 'month' if no
modifier is passed this change should be backwards
compatible.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-03 22:44:36 +02:00
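The idea of a type modifier that defaults to 'month' can be sketched as follows. This is an illustration in Python, not the code in NewsStats.pm; the function name check_period and the exact patterns are assumptions.

```python
import re

# Hypothetical shape checks; the real validation may be stricter.
PERIOD_PATTERNS = {
    "month": re.compile(r"\d{4}-\d{2}"),
    "day": re.compile(r"\d{4}-\d{2}-\d{2}"),
}


def check_period(period, period_type="month"):
    """Return True if `period` has the shape expected for `period_type`.

    `period_type` defaults to "month", so callers that pass only a
    'YYYY-MM' string keep working unchanged.
    """
    pattern = PERIOD_PATTERNS.get(period_type)
    if pattern is None:
        raise ValueError(f"unknown period type: {period_type!r}")
    return pattern.fullmatch(period) is not None
```

Because the modifier is optional, a call such as check_period("2013-09") behaves as before, while check_period("2013-09-03", "day") opts in to the new behaviour.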
Thomas Hochstein 599fefbf6a Merge branch 'thh-bug51' into next
* thh-bug51:
  One more default sorting order ("grouping").
2013-09-03 22:25:23 +02:00
Thomas Hochstein 7624accb6e Merge branch 'thh-small-changes' into next
* thh-small-changes:
  Small comment fixes.
  --sums is not compatible with --checkgroups.
2013-09-03 22:25:13 +02:00
Thomas Hochstein 8dc6823e98 Small comment fixes.
Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-03 17:12:09 +02:00
Thomas Hochstein 17ef44085f --sums is not compatible with --checkgroups.
'Virtual' .ALL groups will never be present in
a checkgroups file, and we can't use them anyway
as they would contain postings from groups that
are filtered out by --checkgroups.

Add a warning, put a note in the documentation.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-03 15:10:07 +02:00
Thomas Hochstein ea91003a99 One more default sorting order ("grouping").
If --group-by is not set, output will be grouped
by month by default (as long as --boundary is
not set to 'level' or 'average', where grouping
by newsgroup is default).

Now we default to 'newsgroup' if just one newsgroup
is requested by --newsgroups, but more than one
month by --month.

Both defaults can be overridden.

Put the forced --group-by=month for --report type
'average' or 'sum' in front, so the defaults are
not checked in that case.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-03 14:56:17 +02:00
Thomas Hochstein 1af57a5390 Merge branch 'thh-restructure' into next
* thh-restructure:
  Make configuration file configurable.
  Fix some whitespace.
  Redo directory structure.
2013-09-03 14:55:42 +02:00
Thomas Hochstein 23ab67a099 Make configuration file configurable.
Add --conffile option to all scripts to
override standard config file location
etc/newsstats.conf.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-03 10:01:20 +02:00
Thomas Hochstein dfc2b81c37 Fix some whitespace.
Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-03 10:01:18 +02:00
Thomas Hochstein 2ad99c20bc Redo directory structure.
* Move all scripts to /bin
* Move configuration to /etc
* Move NewsStats.pm to /lib
* Add new path to NewsStats.pm to all scripts
* Set $HomePath to top level directory
* Move setting of config file name to ReadConf()

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-03 10:01:16 +02:00
Thomas Hochstein 07c0b2589a Release 0.01
Update TODO list.
Update version numbers, ChangeLog, bump copyright
dates.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-02 13:14:33 +02:00
Thomas Hochstein 3d2fd51dd0 Merge branch 'next'
* next: (26 commits)
  Some documentation fixes and enhancements.
  Improve INSTALL documentation.
  README: Update copyright notice.
  README: improve phrasing.
  Change handling of warnings.
  Improve output padding.
  Check for invalid newsgroup names.
  Add some basic validation to config parser.
  Create better newsgroup lists for SQL clause.
  Fix config path detection for install.pl.
  Get empty 'virtual' hierarchies working.
  Add some TODO entries.
  Add database creation to installer.
  Handle undefined previous version when installing.
  Refactor database initialisation in feedlog.pl.
  Add empty 'virtual' .ALL hierarchies as needed.
  Change interpretation of --checkgroups to template
  Be more fault-tolerant when reading checkgroups.
  Remove call to &Bleat where not appropriate.
  Allow more characters in TLH definitions.
  ...
2013-09-02 13:08:39 +02:00
Thomas Hochstein 25b25735dd Merge branch 'language' into next
* language:
  Some documentation fixes and enhancements.
  Improve INSTALL documentation.
  README: Update copyright notice.
  README: improve phrasing.
2013-09-02 13:00:33 +02:00
Thomas Hochstein a036e9da62 Merge branch 'thh-bug13' into next
* thh-bug13:
  Add some basic validation to config parser.
2013-09-02 13:00:23 +02:00
Thomas Hochstein 38fa44f89b Merge branch 'thh-bug53' into next
* thh-bug53:
  Improve output padding.
2013-09-02 13:00:05 +02:00
Thomas Hochstein 7c83a673e6 Merge branch 'thh-bug37' into next
* thh-bug37:
  Create better newsgroup lists for SQL clause.
2013-09-02 12:59:57 +02:00
Thomas Hochstein 5cfcb1c061 Merge branch 'thh-checkinput' into next
* thh-checkinput:
  Check for invalid newsgroup names.
2013-09-02 12:59:45 +02:00
Thomas Hochstein 439c85a280 Merge branch 'thh-warnings' into next
* thh-warnings:
  Change handling of warnings.
2013-09-02 12:59:35 +02:00
Thomas Hochstein 95d9fe2cfd Some documentation fixes and enhancements.
Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-02 12:37:13 +02:00
Thomas Hochstein b31f785064 Improve INSTALL documentation.
Fix wrong example for newsfeeds entry (lacking
backslashes).

Fix typos, update copyright, improve phrasing.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-02 12:37:10 +02:00
Thomas Hochstein 435a99783c README: Update copyright notice.
Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-02 12:37:08 +02:00
Daniel Weber 6e6c520f94 README: improve phrasing.
Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-02 12:37:05 +02:00
Thomas Hochstein 3f817eb428 Change handling of warnings.
Replace 'perl -W' by 'use warnings;'.
The latter is preferred, and '-W'
(instead of '-w') was causing problems with
warnings in DBD::mysql::GetInfo.pm.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-02 12:33:51 +02:00
Thomas Hochstein b342fcf030 Improve output padding.
Take 'length' of numbers into account.

Change GetMaxLength() accordingly and use that
new information in FormatOutput().

Fixes #53.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-02 12:32:07 +02:00
Thomas Hochstein c30822b48b Check for invalid newsgroup names.
Add check to SQLGroupList() and act on it
in groupstats.pl.

Issue #12.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-02 12:31:55 +02:00
Thomas Hochstein db7696e550 Add some basic validation to config parser.
We check for empty mandatory options for
starters.

Fixes #13 ... so we can release RSN :)

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-02 12:31:41 +02:00
Thomas Hochstein 10459ac8c7 Create better newsgroup lists for SQL clause.
Build a 'IN(...)' list for single newsgroup
names without wildcards. Create SQL clause
with a mix of wildcards and wildcard-less
group names.

More code for a better query ...

Fixes #37.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-02 12:30:56 +02:00
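A minimal sketch of the clause building described in this commit: exact names go into one IN(...) list, names with wildcards get one LIKE term each. It is written in Python for illustration; the function name, the column name, the '*' to '%' mapping and the '?' placeholders are assumptions, not taken from SQLGroupList().

```python
def sql_group_clause(newsgroups, column="newsgroup"):
    """Build a WHERE fragment and its bind values for a list of group names.

    Assumption: '*' is the wildcard in user input and maps to SQL '%'.
    Literal '%' or '_' in a name are not escaped in this sketch.
    """
    exact = [g for g in newsgroups if "*" not in g]
    wild = [g.replace("*", "%") for g in newsgroups if "*" in g]

    parts, values = [], []
    if exact:
        placeholders = ", ".join("?" for _ in exact)
        parts.append(f"{column} IN ({placeholders})")
        values.extend(exact)
    for pattern in wild:
        parts.append(f"{column} LIKE ?")
        values.append(pattern)
    return " OR ".join(parts), values


clause, values = sql_group_clause(["de.test", "de.alt.*", "local.test"])
```

Passing the names as bind values rather than interpolating them keeps the group list out of the SQL text itself.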
Thomas Hochstein d28168419e Merge branch 'installation' into next
* installation:
  Fix config path detection for install.pl.
  Add some TODO entries.
  Add database creation to installer.
  Handle undefined previous version when installing.
2013-09-02 12:19:11 +02:00
Thomas Hochstein 512781bd92 Fix config path detection for install.pl.
Make use of $Path - which is set and checked
to display a correct 'newsfeeds' example - to
load our configuration from the correct location.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-02 12:17:27 +02:00
Thomas Hochstein cce2fd0b7f Merge branch 'gatherstats' into next
* gatherstats:
  Get empty 'virtual' hierarchies working.
2013-09-02 06:46:40 +02:00
Thomas Hochstein 1703b8e3b4 Get empty 'virtual' hierarchies working.
Commit b5125b1099 was broken.

We didn't add empty .ALL hierarchies as needed;
we added empty (non-existent) hierarchies without
appended '.ALL', and didn't add the original
empty group we started with.

(What's more, gatherstats didn't even start any
more due to the missing export and import of
&ParseHierarchies from NewsStats.pm.)

Fixes #52 (and some more breakage).

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-09-01 17:52:21 +02:00
Thomas Hochstein 7b310df13f Add some TODO entries.
Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-08-11 22:51:11 +02:00
Thomas Hochstein 36cffe7aed Add database creation to installer.
It's not enough to create tables, one should
create the database first if it is still
missing ...

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-08-11 22:16:24 +02:00
Thomas Hochstein da6fc073ee Handle undefined previous version when installing.
$OptUpdate is undefined when not upgrading, so don't
prepare an upgrade notice in that case; this avoids
using an undefined variable.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-08-11 22:05:25 +02:00
Thomas Hochstein 1c3adb8de2 Merge branch 'feedlog' into next 2013-08-11 22:00:05 +02:00
Thomas Hochstein 98563c619e Refactor database initialisation in feedlog.pl.
* Move database initialisation to a separate function.

* (Re-)try to connect every five seconds
  (instead of going into an endless loop) and
  log successful (re-)connections.

* Log postings that are dropped due to database failures
  to syslog (Message-ID) for recovery.

* If the connection to the database is lost, try to
  recover it (every five seconds) and try again to
  write the pending data.

* Input will be buffered automatically by INN until
  feedlog is able to process it (see man 5 newsfeeds).

Fixes #30, #31.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-08-11 21:59:31 +02:00
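The retry behaviour described in this commit (wait five seconds between attempts, log a successful reconnection) could look roughly like this. It is a Python sketch, not the code in feedlog.pl; the function name and its parameters are hypothetical, and connect stands for any callable that raises ConnectionError on failure.

```python
import time


def connect_with_retry(connect, delay=5.0, sleep=time.sleep, log=print):
    """Call `connect` until it succeeds, waiting `delay` seconds in between.

    `sleep` and `log` are injectable so the loop can be exercised
    without real waiting or real syslog output.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            conn = connect()
        except ConnectionError as err:
            log(f"database connection failed ({err}); retrying in {delay:g}s")
            sleep(delay)
            continue
        if attempts > 1:
            log(f"database connection established after {attempts} attempts")
        return conn
```

The loop never gives up on its own; that fits the commit's note that INN buffers the input until feedlog can process it again.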
Thomas Hochstein c3973e7d0d Merge branch 'gatherstats' into next 2013-08-11 21:56:37 +02:00
Thomas Hochstein a915469e0c Merge branch 'rewrite' into next 2013-08-11 21:56:18 +02:00
Thomas Hochstein b5125b1099 Add empty 'virtual' .ALL hierarchies as needed.
When using a --checkgroups file while tabulating,
valid but empty groups will be added with a posting
count of zero as needed. If all groups in a
sub-hierarchy are empty, the virtual '.ALL' group
for that sub-hierarchy was not created, though.

If local.test.dummy and local.test.binary were
both empty, both groups were added with a posting
count of '0', but local.test.ALL was not.

Now we loop through all hierarchy elements using
ParseHierarchies and add empty .ALL hierarchies as
needed.

Fixes #49.

Also fixing a typo in some comment. :-)

Signed-off-by: Thomas Hochstein <thh@inter.net>
2013-08-11 09:45:00 +02:00
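The local.test example from this commit message can be sketched as follows. This is an illustration in Python; parse_hierarchies only mimics what the message says ParseHierarchies is used for, and whether the real function also yields the top-level 'local.ALL' is an assumption.

```python
def parse_hierarchies(newsgroup):
    """Return the virtual '.ALL' group for every hierarchy level above a group."""
    parts = newsgroup.split(".")
    return [".".join(parts[:i]) + ".ALL" for i in range(1, len(parts))]


def add_empty_all(counts, valid_groups):
    """Add valid but missing groups and their '.ALL' hierarchies with count 0.

    Existing counts are left untouched.
    """
    for group in valid_groups:
        counts.setdefault(group, 0)
        for virtual in parse_hierarchies(group):
            counts.setdefault(virtual, 0)
    return counts


counts = add_empty_all({}, ["local.test.dummy", "local.test.binary"])
```

With both groups empty, the result now contains local.test.ALL with a count of 0 in addition to the two groups themselves.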
Thomas Hochstein 93c8eae2ed Change interpretation of --checkgroups to template
In most hierarchies, the list of valid newsgroups will
change over time, so you'll have to use another
checkgroups file for each month. gatherstats will now
understand the value of --checkgroups to be a template
and amend it with each month it is processing.

Documentation changed accordingly.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2012-10-13 00:52:24 +02:00
Thomas Hochstein 7662b1065e Be more fault-tolerant when reading checkgroups.
* Accept lines starting with whitespace.

* Drop empty "groups", i.e. lines containing only
  whitespace.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2012-10-13 00:52:13 +02:00
Thomas Hochstein 0dc13b3980 Remove call to &Bleat where not appropriate.
Some warn()ings are used for debugging purposes.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2012-10-13 00:52:02 +02:00
Thomas Hochstein 314e31aadf Allow more characters in TLH definitions.
TLH may now also contain literal dots '.',
allowing for using second or third level
hierarchies as "TLH". To facilitate that,
'+' and '-' will be allowed, too.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2012-10-13 00:51:51 +02:00
Thomas Hochstein 7773fb6d8f Match TLHs correctly, not only partially.
The TLH was checked to match the beginning
of the newsgroup name, not the whole TLH part.
So the TLH "de" would match not only "de.test",
but also "denver.test", which was not the
desired outcome.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2012-10-13 00:51:40 +02:00
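The difference between matching the beginning of the name and matching the whole TLH part can be shown in a few lines. This is a Python sketch with a hypothetical function name, not the project's Perl code.

```python
def matches_tlh(newsgroup, tlh):
    """Return True if `newsgroup` lies in the hierarchy `tlh`.

    A plain prefix test would accept "denver.test" for the TLH "de";
    requiring a '.' right after the TLH (or an exact match) does not.
    """
    return newsgroup == tlh or newsgroup.startswith(tlh + ".")
```

For the TLH "de", matches_tlh("de.test", "de") holds while matches_tlh("denver.test", "de") does not.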
Thomas Hochstein 43a0fc7769 Fix parsing of more than one TLH in config.
The code introduced in 17ffbebad5
did not check the correct variable for being an array.

Improve an unrelated comment, too.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2012-10-13 00:51:29 +02:00
Thomas Hochstein 1fa9479946 Adapt install.pl to new coding style.
* Switch to Getopt::Long, change coding style;
  limit line length.

* Replace 'die' and 'warn' by calls to &Bleat().

* Completely changed options due to new
  Getopt::Long processing.

* Adapt to changes in NewsStats.pm

* Redo documentation.

* Update TODO.

Signed-off-by: Thomas Hochstein <thh@inter.net>
2012-10-13 00:44:40 +02:00