PSK Automatic Propagation Reporter -- for Developers

This is a project to automatically gather reception records of PSK activity and then make those records available in near real time to interested parties, typically the amateur who initiated the communication. Many amateurs will run a client that monitors received traffic for callsigns (the pattern 'de callsign callsign') and reports each one that it sees. This is of interest to the amateur who transmitted, and they will be able to see where their signal was received. The pattern chosen is typically part of a standard CQ call. The callsign must appear twice so that a corrupted callsign is not reported.

In practice, an amateur would call CQ and could then (within a few minutes) see where their signal was received. This can be useful in determining propagation conditions or in adjusting antenna and/or radio parameters. The project will also provide an archive of reception records that can be used for research purposes.

There are a number of parts to this project, as shown in the picture below. Each is dealt with separately.

Data Gathering

The data is gathered (somehow) from the client used to decode the PSK traffic. This is not standardized as it depends on the details of the client. Some clients are capable of decoding the entire audio passband simultaneously and these can provide a great deal of useful information.

The data consists of the calling callsign (at a minimum). Highly desirable extra fields are the frequency, signal to noise ratio and intermodulation distortion. Note that each callsign should be reported no more than once per five minute period. Ideally, a callsign should be reported only once per hour if it has not 'changed'. Precisely what constitutes a change is left to the discretion of the developer, but the goal is to minimize the number of database records!
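
The five minute rule above might be sketched in Python as follows. The class name ReportThrottle and its method are hypothetical, and keying only on the callsign is an assumption; what counts as a 'change' is left to the developer.

```python
import time

MIN_INTERVAL = 5 * 60  # seconds; a callsign is reported at most once per period


class ReportThrottle:
    """Remembers when each callsign was last reported (hypothetical helper)."""

    def __init__(self, min_interval=MIN_INTERVAL):
        self.min_interval = min_interval
        self.last_reported = {}  # callsign (upper case) -> unix time

    def should_report(self, callsign, now=None):
        """Return True and record the time if the callsign may be reported."""
        now = time.time() if now is None else now
        key = callsign.upper()  # callsigns are case insensitive
        last = self.last_reported.get(key)
        if last is not None and now - last < self.min_interval:
            return False
        self.last_reported[key] = now
        return True
```

Raising min_interval to an hour for callsigns whose details have not changed would reduce the number of database records further.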

Authors of PSK software are encouraged to implement the data reporting protocol as described in the next section.

Data Reporting

Once the data has been gathered (and duplicate callsigns eliminated), it is formatted into UDP datagrams and transmitted to report.psk.gladstonefamily.net port 4739. The datagram format is IPFIX, as described in draft-ietf-ipfix-protocol-26.txt.

The datagrams should be sent at a rate of no more than one every five minutes (unless the packet becomes full). Timed sends of packets must not be synchronized to the system clock. Any flushing of packets on timers can be based on when the program started. If it is based on when the last signal was received, then please add some randomization to the five minute timer. This is to prevent a significant number of stations from reporting at the same time.
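
One way to add the randomization is sketched below in Python. The 30 second jitter and the function name are assumptions, not part of the protocol; any scheme that keeps stations from sending in step would do.

```python
import random

BASE_INTERVAL = 5 * 60  # seconds between sends, from the text
MAX_JITTER = 30         # seconds of randomization; an assumed value


def next_send_delay(base=BASE_INTERVAL, max_jitter=MAX_JITTER, rng=random):
    """Return a delay of at least `base` seconds plus a random offset."""
    return base + rng.uniform(0, max_jitter)
```

The delay never drops below five minutes, so the rate limit is still respected.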

The IPFIX packet format contains (at a very high level) two parts — a number of templates that describe the format of the data part, and the data part. The template part only need be transmitted once per hour, but probably should be transmitted for the first three packets when the application starts (this is due to the lossy nature of UDP). Note that the same UDP source port number should be used to send each of the datagrams.

For the technically inclined, read the next section. For the people who just want to hack together the packet sender, skip the next section, and move to the Cookie Cutter section.

Technical IPFIX Information

The attributes used for this application are:
Name | Attribute Id | Type | Meaning
senderCallsign | 30351.1 | string | The callsign of the sender of the transmission.
receiverCallsign | 30351.2 | string | The callsign of the receiver of the transmission.
senderLocator | 30351.3 | string | The locator of the sender of the transmission.
receiverLocator | 30351.4 | string | The locator of the receiver of the transmission.
frequency | 30351.5 | unsignedInteger | The frequency of the transmission in Hertz.
sNR | 30351.6 | integer | The signal to noise ratio of the transmission. Normally 1 byte.
iMD | 30351.7 | integer | The intermodulation distortion of the transmission. Normally 1 byte.
decoderSoftware | 30351.8 | string | The name and version of the decoding software.
antennaInformation | 30351.9 | string | A freeform description of the receiving antenna.
mode | 30351.10 | string | The mode of the communication. One of the ADIF values for MODE.
informationSource | 30351.11 | string | Identifies the source of the record. The bottom 2 bits have the following meaning: 1 = Automatically Extracted. 2 = From a Call Log. 3 = Other Manual Entry. The 0x80 bit indicates that this record is a test transmission.
persistentIdentifier | 30351.12 | string | Random string that identifies the sender. This may be used in the future as a primitive form of security.
flowStartSeconds | 150 | dateTimeSeconds (Integer) | The time of the transmission (absolute seconds since 1/1/1970).

Most of the attributes are enterprise specific, and use the enterprise number 30351 which is registered to me.

Some of these attributes are used in data records, and some are used in a single option record. The option record should appear in each datagram.

The data record should contain as many as possible of: senderCallsign, frequency, iMD, sNR, flowStartSeconds. The option record should contain as many as possible of: receiverCallsign, receiverLocator, decoderSoftware, antennaInformation. The variable length encoding should be used for string fields. This probably gives about 16 bytes (on average) per reception record, and so a datagram can hold 80 to 90 records. This limit is unlikely to be reached in five minutes.

The templates should be transmitted at least once per hour, and in the first few packets on startup.

Cookie Cutter Information

The packet format appears to be complex, but there is a lot of boilerplate. The packet is assembled out of a set of pieces, with the actual data appended to the end.

All values are transmitted in 'network order', that is, with the most significant byte of an integer transmitted first. Note that this is not the same byte ordering as a PC. The C functions htonl/htons will do the appropriate conversions (for all platforms).
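
For those not working in C, here is the same conversion sketched with Python's struct module, where the "!" prefix selects network order. The variable names are only for illustration.

```python
import struct

# "!" selects network (big-endian) byte order; "I" is a 4-byte unsigned
# integer and "H" is a 2-byte unsigned integer.
frequency = struct.pack("!I", 14070567)
length = struct.pack("!H", 0x0090)

print(" ".join(f"{b:02X}" for b in frequency))  # prints 00 D6 B3 27
```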

For a definition of the fields, see the attributes table above.

Header

The header contains the overall length of the packet. It also contains a sequence number and the time of transmission. All times are 'unix times' — i.e. the number of seconds since 1/1/1970 00:00 UTC. This is the value returned by time(0). The sequence number allows detection of missed and duplicate packets.
00 0A ll ll tt tt tt tt ss ss ss ss ii ii ii ii
'll ll' is the two byte length code that is the length of the entire datagram. 'tt tt tt tt' is the transmission time, and 'ss ss ss ss' is the sequence number. 'ii ii ii ii' is a random identifier that helps associate packets with UDP streams. This is needed to deal with nasty cases of residential NAT/PAT gateways and DHCP.
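
A minimal sketch of packing this header in Python follows. The function name build_header is hypothetical; the field order and sizes are taken from the layout above.

```python
import struct
import time


def build_header(total_length, sequence, identifier, now=None):
    """Pack the 16-byte header: 00 0A, length, time, sequence, identifier."""
    now = int(time.time()) if now is None else now
    return struct.pack("!HHIII", 0x000A, total_length, now, sequence, identifier)
```

The identifier would normally be picked at random once when the program starts and then reused for every datagram.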

Templates

The templates define the precise layout of the data portion of the datagram. The templates do not need to appear in every datagram. They are cached by the receiving process. However, it is good practice to send them in the first three datagrams sent on application startup, and then once per hour thereafter.
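
The resend policy could be sketched like this in Python. The function and its arguments are hypothetical bookkeeping that the sending program would have to maintain itself.

```python
TEMPLATE_INTERVAL = 60 * 60  # resend the templates once per hour
STARTUP_PACKETS = 3          # and in the first three datagrams after startup


def should_send_templates(packets_sent, last_template_time, now):
    """Decide whether the next datagram should carry the templates."""
    if packets_sent < STARTUP_PACKETS:
        return True
    return now - last_template_time >= TEMPLATE_INTERVAL
```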

There are two templates, one for the reception record, and one for the local station record. In each case, there are a number of different templates to pick — they have different fields. Pick the one that matches the data that you have available.

The station information record has two options.

For receiverCallsign, receiverLocator, decoderSoftware use

00 03 00 24 01 18 00 03 00 00 
80 02 FF FF 00 00 76 8F 
80 04 FF FF 00 00 76 8F 
80 08 FF FF 00 00 76 8F 
00 00

For receiverCallsign, receiverLocator, decoderSoftware, antennaInformation use

00 03 00 2C 01 18 00 04 00 00 
80 02 FF FF 00 00 76 8F 
80 04 FF FF 00 00 76 8F 
80 08 FF FF 00 00 76 8F 
80 09 FF FF 00 00 76 8F 
00 00

The reception record has two options (if your software does not fit nicely into either set, then please contact me at the address below, and I will generate a template record specifically for you).

For senderCallsign, frequency, flowStartSeconds use:

00 02 00 1C 01 2C 00 03 
80 01 FF FF 00 00 76 8F  
80 05 00 04 00 00 76 8F 
00 96 00 04 

For senderCallsign, frequency, sNR, iMD, flowStartSeconds use:

00 02 00 2C 01 2C 00 05 
80 01 FF FF 00 00 76 8F  
80 05 00 04 00 00 76 8F 
80 06 00 01 00 00 76 8F 
80 07 00 01 00 00 76 8F 
00 96 00 04 
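
For the curious, the three-field reception template above can be rebuilt from its parts. This Python sketch assumes the usual IPFIX template set layout (set id, set length, template id, field count, then one field specifier per field); the helper names are hypothetical.

```python
import struct

ENTERPRISE = 30351        # 0x0000768F
VARIABLE_LENGTH = 0xFFFF  # marks a variable-length (string) field


def enterprise_field(attribute_id, length):
    """Field specifier with the enterprise bit (0x8000) set."""
    return struct.pack("!HHI", 0x8000 | attribute_id, length, ENTERPRISE)


def standard_field(attribute_id, length):
    """Field specifier for a standard (non-enterprise) attribute."""
    return struct.pack("!HH", attribute_id, length)


def reception_template(fields, template_id=0x012C):
    """Wrap the field specifiers in a template set (set id 2)."""
    body = struct.pack("!HH", template_id, len(fields)) + b"".join(fields)
    return struct.pack("!HH", 2, 4 + len(body)) + body


template = reception_template([
    enterprise_field(1, VARIABLE_LENGTH),  # senderCallsign
    enterprise_field(5, 4),                # frequency
    standard_field(150, 4),                # flowStartSeconds
])
```

Copying the hex boilerplate is still the simpler route; the sketch only shows where each byte comes from.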

Data

The first data record is the station information record. This has the following header
01 18 ll ll 
'll ll' is the two byte length code that is the length of the station information record, including the header and padding.

The data that follows is encoded as three (or four) fields of byte length code followed by UTF-8 (use ASCII if you don't know what UTF-8 is) data. The length code is the number of bytes of data and does not include the length code itself. Each field is limited to a length code of no more than 254 bytes. Finally, the record is null padded to a multiple of 4 bytes.

For example, to encode N1DQ, FN42hn, Homebrew v5.6, the datagram fragment would look like:

01 18 00 20 
04 4E 31 44 51 
06 46 4E 34 32 68 6E 
0D 48 6F 6D 65 62 72 65 77 20 76 35 2E 36 
00 00 
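
The same fragment can be produced with a few lines of Python. This is a sketch using hypothetical helper names; it handles only the one-byte length code described above.

```python
import struct


def encode_string(text):
    """One-byte length code followed by UTF-8 data (at most 254 bytes)."""
    data = text.encode("utf-8")
    if len(data) > 254:
        raise ValueError("field too long for a one-byte length code")
    return bytes([len(data)]) + data


def pad4(length):
    """Number of null bytes needed to reach a multiple of four."""
    return -length % 4


def station_record(fields, set_id=0x0118):
    """Header, length-prefixed strings, then null padding."""
    body = b"".join(encode_string(f) for f in fields)
    total = 4 + len(body)  # the length includes the 4-byte header
    padding = pad4(total)
    return struct.pack("!HH", set_id, total + padding) + body + b"\x00" * padding


record = station_record(["N1DQ", "FN42hn", "Homebrew v5.6"])
```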

After this (finally!) come the reception records. This has the following header

01 2C ll ll
'll ll' is the two byte length code that is the length of all the reception records, including the header and padding.

The data that follows is encoded as a sequence of records, each of which contains the fields from the template that you chose above, in template order. The callsign is encoded as a string using the byte length code format as described above. The frequency is a four byte integer (in network order). The sNR and iMD are single bytes (i.e. -128 to +127) and are only present if you chose that template. The flowStartSeconds is a four byte integer (in network order) that records the time (the value of time(0)) that the callsign was recognized.

There is no padding between the records, but there is padding (with null bytes) to a multiple of four at the end.

For example, to encode (using the first template) the following two reception records: N1DQ, 14070567, 1200960084 (some time about 2008-01-22 00:00Z) and KB1MBX, 14070987, 1200960104 (some time about 2008-01-22 00:00Z), the datagram fragment would look like:

01 2C 00 20 
 04 4E 31 44 51 
 00 D6 B3 27 
 47 95 32 54 

 06 4B 42 31 4D 42 58 
 00 D6 B4 CB 
 47 95 32 68
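
A Python sketch that produces this fragment for the first (three-field) template follows. The function names are hypothetical, and the tuple layout of the input is just a convenient assumption.

```python
import struct


def encode_string(text):
    """One-byte length code followed by UTF-8 data."""
    data = text.encode("utf-8")
    return bytes([len(data)]) + data


def reception_set(records, set_id=0x012C):
    """records: iterable of (callsign, frequency_hz, unix_time) tuples."""
    body = b"".join(
        encode_string(call) + struct.pack("!II", freq, when)
        for call, freq, when in records
    )
    padding = -(4 + len(body)) % 4  # pad only once, at the very end
    length = 4 + len(body) + padding
    return struct.pack("!HH", set_id, length) + body + b"\x00" * padding


fragment = reception_set([
    ("N1DQ", 14070567, 1200960084),
    ("KB1MBX", 14070987, 1200960104),
])
```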

An example of a complete datagram (using the sample data above), sent at 1200960114, with sequence number 1:

00 0A 00 90 47 95 32 72 00 00 00 01 00 00 00 00  

00 02 00 1C  01 2C 00 03  80 01 FF FF 00 00 76 8F  80 05 00 04 00 00 76 8F  00 96 00 04 

00 03 00 24  01 18 00 03 00 00  80 02 FF FF 00 00 76 8F  80 04 FF FF 00 00 76 8F  80 08 FF FF 00 00 76 8F  00 00  

01 18 00 20  04 4E 31 44 51  06 46 4E 34 32 68 6E  0D 48 6F 6D 65 62 72 65 77 20 76 35 2E 36  00 00 

01 2C 00 20  04 4E 31 44 51  00 D6 B3 27  47 95 32 54   06 4B 42 31 4D 42 58  00 D6 B4 CB  47 95 32 68 
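
Putting the pieces together might look like the following Python sketch. The assemble function is hypothetical; the byte strings are copied from the sections above, and the header length is computed rather than typed in.

```python
import struct

# Boilerplate copied from the template sections above.
RECEPTION_TEMPLATE = bytes.fromhex(
    "00 02 00 1C 01 2C 00 03 80 01 FF FF 00 00 76 8F"
    " 80 05 00 04 00 00 76 8F 00 96 00 04")
STATION_TEMPLATE = bytes.fromhex(
    "00 03 00 24 01 18 00 03 00 00 80 02 FF FF 00 00 76 8F"
    " 80 04 FF FF 00 00 76 8F 80 08 FF FF 00 00 76 8F 00 00")


def assemble(station_set, reception_set, sent_at, sequence, identifier,
             include_templates=True):
    """Prefix the sets with a header whose length covers the whole datagram."""
    body = b""
    if include_templates:
        body += RECEPTION_TEMPLATE + STATION_TEMPLATE
    body += station_set + reception_set
    header = struct.pack("!HHIII", 0x000A, 16 + len(body),
                         sent_at, sequence, identifier)
    return header + body


station = bytes.fromhex(
    "01 18 00 20 04 4E 31 44 51 06 46 4E 34 32 68 6E"
    " 0D 48 6F 6D 65 62 72 65 77 20 76 35 2E 36 00 00")
reception = bytes.fromhex(
    "01 2C 00 20 04 4E 31 44 51 00 D6 B3 27 47 95 32 54"
    " 06 4B 42 31 4D 42 58 00 D6 B4 CB 47 95 32 68")
datagram = assemble(station, reception, 1200960114, 1, 0)
```

Sending is then a single sendto call on a UDP socket that stays open, so that every datagram leaves from the same source port.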

Data Storage

The datagrams are received and, if the templates are not present, the source IP/port combination is used to see if there is saved template information. With template information (from either source), the data can be extracted. Clock skew is also detected: if the time of transmission is more than one minute from the time of reception, it is assumed that the sender's clock is set incorrectly. In this case, the clock skew can be calculated and all times in the packet can be adjusted accordingly.
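
The skew adjustment amounts to something like this Python sketch. The function name is hypothetical, and treating the whole difference as skew (ignoring network delay) is a simplifying assumption.

```python
MAX_SKEW = 60  # seconds; beyond this the sender's clock is assumed wrong


def correct_times(sent_at, received_at, record_times, max_skew=MAX_SKEW):
    """Shift record times by the observed skew when it exceeds max_skew."""
    skew = received_at - sent_at
    if abs(skew) <= max_skew:
        return list(record_times)
    return [t + skew for t in record_times]
```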

The reception records are combined with the station information data and inserted into a database.

Data Retrieval

Reception records can be retrieved from the database by performing an HTTP GET or POST on the URL http://retrieve.psk.gladstonefamily.net/query?senderCallsign=requestedcall

This will return the last 100 reception records for the requested callsign, but for no more than 6 hours in any event. The default set of fields will be returned (receiverCallsign, receiverLocator, senderCallsign, frequency, flowStartSeconds). [Comments invited on query interface.]

The field list can be explicitly requested by using the 'fields' parameter on the URL. The default is 'fields=receiverCallsign,receiverLocator,senderCallsign,frequency,flowStartSeconds'.
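
Building the query URL could be sketched in Python as below. The function name is hypothetical, and leaving the commas in the field list unescaped is an assumption about what the server accepts.

```python
from urllib.parse import urlencode

BASE_URL = "http://retrieve.psk.gladstonefamily.net/query"


def query_url(callsign, fields=None):
    """Return the retrieval URL for a callsign and optional field list."""
    params = {"senderCallsign": callsign}
    if fields:
        params["fields"] = ",".join(fields)
    # safe="," keeps the commas in the field list readable
    return BASE_URL + "?" + urlencode(params, safe=",")
```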

The format of the returned information will be an XML document [is this a useful format?]. A sample XML document is:

<receptionReports>
  <receptionReport receiverCallsign="KB1MBX" receiverLocator="FN42hn" senderCallsign="N1DQ" frequency="14070987" flowStartSeconds="xxxxxxx" />
  <receptionReport receiverCallsign="ZZ1ZZ" receiverLocator="GG99" senderCallsign="N1DQ" frequency="14070987" flowStartSeconds="xxxxxxx" />
</receptionReports>
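
Reading such a document might look like this Python sketch. It assumes the service returns well-formed XML with quoted attribute values; the flowStartSeconds value in the sample string is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; the timestamp is invented.
SAMPLE = """<receptionReports>
  <receptionReport receiverCallsign="KB1MBX" receiverLocator="FN42hn"
                   senderCallsign="N1DQ" frequency="14070987"
                   flowStartSeconds="1200960104" />
</receptionReports>"""


def parse_reports(xml_text):
    """Return one dict of attributes per receptionReport element."""
    root = ET.fromstring(xml_text)
    return [dict(report.attrib) for report in root.iter("receptionReport")]


reports = parse_reports(SAMPLE)
```

All attribute values arrive as strings, so the frequency and time need an int() conversion before use.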

An example can be seen for N1DQ.

Users are encouraged to retrieve reception data no more often than once every five minutes. If the display of reception data is integrated into the PSK transmitting application, then the timing can be optimized — do a retrieval five minutes after each transmission of 'de callsign callsign' (provided that it is more than five minutes since the previous retrieval). The purpose of the five minute delay is to allow all the receivers to make their reports.

Data Display

The data can be displayed in any way that the end software developer desires. As part of this project, there will be a simple Google Maps based application that displays the reception data for a given callsign in a browser window. There is a prototype map display.

Miscellaneous

Why five minutes?

I am concerned about the data rate if this takes off. There could be (say) 10,000 clients submitting data continuously, and at a 300 second interval, this comes to about 33 packets per second. This will turn into a significant database load if there are any significant number of reports per datagram.

Hopefully most of the clients will implement an adaptive use of the retrieval mechanism, otherwise retrieval will prove a huge load. You may think that 10,000 is a lot of clients, but that is not clear to me. This application gives people a reason to just leave their rigs turned on, tuned to one of the PSK frequencies and logging away. In fact, I'd like one of the PSK clients to be able to control the frequency by time of day!

Why use IPFIX?

IPFIX is a protocol designed for logging data associated with network traffic. The packets are self describing, and this allows for future upgrades in the PSK reporting system without huge difficulties. Also, I prefer not to reinvent the wheel!

Notes

Callsigns are case insensitive.

The callsign to be reported is the entire string that is repeated after the 'DE'; this includes any prefixes and suffixes. It is important that all authors be consistent, as it allows the transmitter of the CQ to query the database for their particular string.
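
A rough Python sketch of spotting the 'de callsign callsign' pattern is shown here. The character class (letters, digits and '/') is an assumption about what a callsign with prefixes and suffixes may contain, and real decoders will need to cope with noisier text.

```python
import re

# 'de', then a callsign, then the same string repeated. The trailing
# lookahead rejects a second copy that carries an extra suffix.
DE_PATTERN = re.compile(
    r"\bde\s+([A-Z0-9/]+)\s+\1(?![A-Z0-9/])", re.IGNORECASE)


def find_callsigns(text):
    """Return the upper-cased callsigns seen in 'de call call' form."""
    return [m.group(1).upper() for m in DE_PATTERN.finditer(text)]
```

For example, find_callsigns("CQ CQ de N1DQ N1DQ pse k") returns ["N1DQ"], while a corrupted repeat such as "de N1DQ N1DX" yields nothing.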

Testing

For testing purposes, there is a listener on psk.gladstonefamily.net on port 14739 which will analyze the received packet. The results of the analysis can be seen at packet analysis. The last few received packets are displayed.
Philip Gladstone