MeerKAT data in the archive are stored in a dedicated data format known as the MeerKAT Visibility Format (MVF). MeerKAT science data are stored in MVF version 4. Early commissioning and science data from MeerKAT-16 are in MVF version 3, but very few users will ever encounter these files.
Access is facilitated by the katdal package. This package seamlessly detects any prior data format and provides a standard interface to the visibilities, sensor data and metadata. Note that katdal also has various fixes built into it to account for errors found retrospectively in the data (e.g., timestamp and frequency errors). For this reason, it is advised to always use the latest version of katdal.
Conversion to Measurement Set (MS) format
Most external data reduction pipelines for MeerKAT make use of MS files. Conversion can be requested on the archive interface.
The MS format does not support continuous scans, so data from such non-standard observations must be accessed with katdal instead.
On occasion, one may want to download a small subset of data for quick checks, without waiting on the SARAO archive data transfer queue. In this case, it is possible to obtain the direct link to the rdb file (see article on archive), and run mvftoms on a local machine across the network.
#katdal can be installed using pip
pip install katdal
#get the link to the rdb file from the archive <katdal link>
#you can use various selection options, including binning in time or channels
mvftoms.py --target J1939-6342 --flags '' --dumptime 60 -o newms.ms <katdal link>
Using katdal to directly access data objects in the archive
The OBIT-based SDP pipeline (and OBIT itself) uses the native data format, so no conversion is necessary. On occasion, users may need to write their own data access and reduction scripts, e.g. for HI intensity mapping. Please have a look at the documentation on data chunking to understand how the data are physically stored and accessed, and how to optimise retrieval speeds.
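As a rough illustration of chunk-aware access, the sketch below reads visibilities in blocks of dumps rather than all at once. The helper names (`dump_slices`, `mean_amplitude_per_block`) and the block size of 32 dumps are hypothetical and chosen only for illustration; the actual chunk layout of a given data set is described in the chunking documentation. The sketch assumes a data set that has already been opened with katdal.open, as in the example that follows.

```python
def dump_slices(n_dumps, dumps_per_block):
    """Yield slices that cover range(n_dumps) in blocks of dumps_per_block."""
    for start in range(0, n_dumps, dumps_per_block):
        yield slice(start, min(start + dumps_per_block, n_dumps))


def mean_amplitude_per_block(data, dumps_per_block=32):
    """Return the mean visibility amplitude of each block of dumps.

    `data` is assumed to be a data set returned by katdal.open().
    The block size is an arbitrary illustrative value, not a recommendation.
    """
    import numpy as np  # imported here so dump_slices works without numpy

    means = []
    for block in dump_slices(data.shape[0], dumps_per_block):
        vis = data.vis[block]  # indexing the lazy array loads this block of dumps
        means.append(float(np.abs(vis).mean()))
    return means


# The 327 dumps of the example observation, split into blocks of 100 dumps
print(list(dump_slices(327, 100)))
```

Applying a selection (antennas, channels, scans) before indexing `data.vis` should further reduce the amount of data that has to be transferred.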
Below is an example of how to access and plot visibilities directly from the archive. First copy the rdb link, including its token, to the clipboard and paste it into your code. This example uses a delay calibration observation.
import katdal
from astropy.time import Time
from matplotlib.dates import DateFormatter
#get the link to the rdb file from the archive
link = 'https://archive-gw-1.kat.ac.za/1639440394/1639440394_sdp_l0.full.rdb?token=eyJ0eXAiOiJKV1QiLCJhbGciOiJFUzI1NiJ9.eyJpc3MiOiJrYXQtYXJjaGl2ZS5rYXQuYWMuemEiLCJhdWQiOiJhcmNoaXZlLWd3LTEua2F0LmFjLnphIiwiaWF0IjoxNjM5NTczNDQ0LCJwcmVmaXgiOlsiMTYzOTQ0MDM5NCJdLCJleHAiOjE2NDAxNzgyNDQsInN1YiI6InNoYXJtaWxhQHNhcmFvLmFjLnphIiwic2NvcGVzIjpbInJlYWQiXX0.H2yEdZY8BEJDAgMR2XyBf3r6IcHQ2E2hCmhaUmKJgqchy_okOxxNr5XLpHFTpNSv9iitvFQX40B1_ioLr1YvIQ'
data = katdal.open(link)
print(data)
===============================================================================
Name: https://archive-gw-1.kat.ac.za/1639440394/1639440394_sdp_l0.full.rdb?token=eyJ0eXAiOiJKV1QiLCJhbGciOiJFUzI1NiJ9.eyJpc3MiOiJrYXQtYXJjaGl2ZS5rYXQuYWMuemEiLCJhdWQiOiJhcmNoaXZlLWd3LTEua2F0LmFjLnphIiwiaWF0IjoxNjM5NTczNDQ0LCJwcmVmaXgiOlsiMTYzOTQ0MDM5NCJdLCJleHAiOjE2NDAxNzgyNDQsInN1YiI6InNoYXJtaWxhQHNhcmFvLmFjLnphIiwic2NvcGVzIjpbInJlYWQiXX0.H2yEdZY8BEJDAgMR2XyBf3r6IcHQ2E2hCmhaUmKJgqchy_okOxxNr5XLpHFTpNSv9iitvFQX40B1_ioLr1YvIQ | 1639440394-sdp-l0 (version 4.0)
===============================================================================
Observer: Operator Experiment ID: 20211213-0027
Description: 'Delaycal'
Observed from 2021-12-14 02:06:46 SAST to 2021-12-14 02:12:13.127 SAST
Dump rate / period: 0.99961 Hz / 1.000 s
Subarrays: 1
ID Antennas Inputs Corrprods
0 m001,m002,m003,m004,m005,m006,m007,m008,m009,m010,m011,m013,m015,m017,m018,m019,m022,m025,m026,m027,m028,m029,m030,m032,m033,m034,m035,m037,m038,m039,m040,m041,m042,m043,m044,m045,m046,m047,m048,m049,m050,m051,m052,m053,m054,m055,m056,m057,m058,m059,m060,m061,m062,m063 108 5940
Spectral Windows: 1
ID Band Product CentreFreq(MHz) Bandwidth(MHz) Channels ChannelWidth(kHz)
0 UHF c544M1k 816.000 544.000 1024 531.250
-------------------------------------------------------------------------------
Data selected according to the following criteria:
ants=['m001', 'm002', 'm003', 'm004', 'm005', 'm006', 'm007', 'm008', 'm009', 'm010', 'm011', 'm013', 'm015', 'm017', 'm018', 'm019', 'm022', 'm025', 'm026', 'm027', 'm028', 'm029', 'm030', 'm032', 'm033', 'm034', 'm035', 'm037', 'm038', 'm039', 'm040', 'm041', 'm042', 'm043', 'm044', 'm045', 'm046', 'm047', 'm048', 'm049', 'm050', 'm051', 'm052', 'm053', 'm054', 'm055', 'm056', 'm057', 'm058', 'm059', 'm060', 'm061', 'm062', 'm063']
spw=0
subarray=0
-------------------------------------------------------------------------------
Shape: (327 dumps, 1024 channels, 5940 correlation products) => Size: 15.912 GB
Antennas: m001,m002,m003,m004,m005,m006,m007,m008,m009,m010,m011,m013,m015,m017,m018,m019,m022,m025,m026,m027,m028,m029,m030,m032,m033,m034,m035,m037,m038,m039,m040,m041,m042,m043,m044,m045,m046,m047,m048,m049,m050,m051,m052,m053,m054,m055,m056,m057,m058,m059,m060,m061,m062,m063 Inputs: 108 Autocorr: yes Crosscorr: yes
Channels: 1024 (index 0 - 1023, 544.000 MHz - 1087.469 MHz), each 531.250 kHz wide
Targets: 1 selected out of 1 in catalogue
ID Name Type RA(J2000) DEC(J2000) Tags Dumps ModelFlux(Jy)
0 J0408-6545 radec 4:08:20.38 -65:45:09.1 bfcal single_accumulation 327 29.68
Scans: 6 selected out of 6 total Compscans: 2 selected out of 2 total
Date Timerange(UTC) ScanState CompScanLabel Dumps Target
14-Dec-2021/00:06:46 - 00:06:57 0:slew 0:un_corrected 12 0:J0408-6545
00:06:58 - 00:09:04 1:track 0:un_corrected 127 0:J0408-6545
00:09:05 - 00:10:58 2:stop 0:un_corrected 114 0:J0408-6545
00:10:59 - 00:11:00 3:stop 1:corrected 2 0:J0408-6545
00:11:01 - 00:11:10 4:slew 1:corrected 10 0:J0408-6545
00:11:11 - 00:12:12 5:track 1:corrected 62 0:J0408-6545
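The figures in this summary can be cross-checked with a little arithmetic, which is a useful sanity check before downloading anything. The sketch below uses only numbers printed in the summary; the one assumption is 8 bytes per visibility (single-precision complex), which reproduces the reported size.

```python
# Values copied from the summary above
n_ants = 54            # antennas listed in the subarray
n_dumps = 327
n_chans = 1024
centre_mhz = 816.0
bandwidth_mhz = 544.0

# Baselines including autocorrelations, times 4 polarisation products
n_baselines = n_ants * (n_ants + 1) // 2
n_corrprods = 4 * n_baselines              # 5940, as reported

# Assumption: each visibility occupies 8 bytes (single-precision complex)
size_gb = n_dumps * n_chans * n_corrprods * 8 / 1e9

# Channel width and the centre frequencies of the first and last channels
chan_width_khz = bandwidth_mhz / n_chans * 1e3
first_chan_mhz = centre_mhz - bandwidth_mhz / 2
last_chan_mhz = first_chan_mhz + (n_chans - 1) * bandwidth_mhz / n_chans

print(f"{n_corrprods} correlation products, {size_gb:.3f} GB")
print(f"{chan_width_khz:.3f} kHz channels, "
      f"{first_chan_mhz:.3f} MHz - {last_chan_mhz:.3f} MHz")
```

The dump counts of the six scans (12 + 127 + 114 + 2 + 10 + 62) also add up to the 327 dumps in the reported shape.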
Next, select a single baseline to examine the visibilities and generate a spectrum.
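The original plotting code is not reproduced here, so the following is only a minimal sketch of how such a selection and a phase waterfall plot might look. The function names `baseline_selection` and `plot_phase_waterfall` are hypothetical, and the choice of the HH polarisation product and an elapsed-time axis are assumptions made for illustration. It relies on katdal's select method and the vis, freqs and timestamps attributes of the opened data set.

```python
def baseline_selection(ant1, ant2, pol="HH"):
    """Build keyword arguments for katdal's select() for a single baseline."""
    return {
        "ants": f"{ant1},{ant2}",
        "corrprods": "cross",  # drop the autocorrelations of the two antennas
        "pol": pol,            # keep one polarisation product (assumed: HH)
    }


def plot_phase_waterfall(link, ant1="m001", ant2="m063", pol="HH"):
    """Open the data set at `link` and plot phase against frequency and time."""
    # Imported lazily so that baseline_selection can be used on its own
    import katdal
    import numpy as np
    import matplotlib.pyplot as plt

    data = katdal.open(link)
    data.select(**baseline_selection(ant1, ant2, pol))

    vis = data.vis[:]                          # shape: (dumps, channels, 1)
    phase = np.angle(vis[:, :, 0], deg=True)   # phase in degrees
    freqs_mhz = data.freqs / 1e6
    elapsed = data.timestamps - data.timestamps[0]

    fig, ax = plt.subplots()
    image = ax.imshow(
        phase, aspect="auto", origin="lower", vmin=-180, vmax=180,
        extent=[freqs_mhz[0], freqs_mhz[-1], elapsed[0], elapsed[-1]],
    )
    ax.set_xlabel("Frequency (MHz)")
    ax.set_ylabel("Time since start (s)")
    fig.colorbar(image, ax=ax, label="Phase (deg)")
    return fig


print(baseline_selection("m001", "m063"))
```

Calling `plot_phase_waterfall(link)` with the `link` defined earlier requires a valid, unexpired token and network access to the archive.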
Example plot generated in ipython from data accessed directly from the archive.
A waterfall plot showing the phases between M001 and M063 before and after delay calibration.