Ultra-Fast Data Acquisition System for Coherent Synchrotron Radiation Based on Superconducting Terahertz Detectors
- 4th International Particle Accelerator Conference, 12-17 May 2013, Shanghai
- M. Caselle, M. Balzer, S. Cilingaryan, M. Hofherr, V. Judin, A. Kopmann, K. Il'in, A. Menshikov, A.-S. Müller, N. J. Smale, P. Thoma, S. Wuensch, M. Siegel, M. Weber
- Title figure labels: fast GS/s input stages, 42 ps
- KIT, University of the State of Baden-Württemberg and National Research Center of the Helmholtz Association, www.kit.edu
THz radiation at ANKA
- ANKA parameters:
  - Circumference: 110.4 m
  - RF system: 500 MHz
  - Revolution time: 368 ns
  - Harmonic number: 184
  - Revolution frequency: 2.71 MHz
  - Beam energy: 2.5 GeV (low-alpha mode: 1.3 GeV)
  - Bunch length (low-alpha mode): a few ps
  - Bunch spacing: 2 ns
- Normal mode: incoherent X-ray radiation
- Low-alpha mode: THz coherent synchrotron radiation (CSR)
- References:
  - A.-S. Müller et al., MOPEA019, these proceedings
  - A.-S. Müller et al., Observation of Coherent THz Radiation from the ANKA and MLS Storage Rings with a Hot Electron Bolometer, TU5RFP027, 23rd Particle Accelerator Conference (PAC09), Vancouver, Canada, 2009
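The ring parameters listed on this slide are tied together by simple relations, which the following minimal sketch checks. It assumes ultra-relativistic electrons moving at essentially the speed of light and uses the nominal 500 MHz RF value from the slide; the variable names are illustrative only.

```python
# Consistency check of the ANKA parameters quoted on the slide.
C_LIGHT = 299_792_458.0    # speed of light in m/s (exact SI value)

circumference_m = 110.4    # from the slide
rf_frequency_hz = 500e6    # from the slide (nominal value)

# Assumption: the electrons are ultra-relativistic, so v is essentially c.
revolution_time_s = circumference_m / C_LIGHT
revolution_frequency_hz = 1.0 / revolution_time_s

# The harmonic number is the number of RF buckets per revolution.
harmonic_number = round(rf_frequency_hz * revolution_time_s)

# Adjacent buckets are separated by one RF period.
bunch_spacing_s = 1.0 / rf_frequency_hz

print(f"revolution time:      {revolution_time_s * 1e9:.1f} ns")
print(f"revolution frequency: {revolution_frequency_hz / 1e6:.2f} MHz")
print(f"harmonic number:      {harmonic_number}")
print(f"bunch spacing:        {bunch_spacing_s * 1e9:.1f} ns")
```

The computed values agree with the slide to within rounding of the quoted RF frequency.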
Ultra-fast YBCO THz detectors for picosecond synchrotron pulses
- Nanometer-sized YBCO detectors in a high-speed readout system, operated at > 77 K
- Ultra-wide dynamic range (DR > 30 dB)
- Picosecond time resolution
- Figure: detector response (mV) versus average THz power (mW) with linear fit; detector image with 0.3 µm scale label
- P. Thoma et al., Applied Physics Letters 101, 142601 (2012)
- P. Probst et al., Physical Review B 85, 174511 (2012)
- Institut für Mikro- und Nanoelektronische Systeme
CSR-THz experimental setup
- Signal chain: CSR at ANKA IR1 -> terahertz YBCO detector -> wideband low-noise amplifier (LNA) -> pulse sampling board (T/H, ADC) -> high-throughput readout board (FPGA, DDR3) -> DMA/PCIe -> PC/DAQ
- Temperature zones in the diagram: cryogenic (7-80 K) and ambient
- Detector system based on a superconducting NbN/YBCO ultra-fast bolometer with a bandwidth up to 1 THz (P. Probst, Applied Physics Letters 98, 043504 (2011))
- Pulse width at the LNA output: a few tens of picoseconds
- Incoming pulse frequency of 500 MHz, synchronous with the ANKA RF system (i.e. with the CSR bunch emission)
- GOAL: turn-by-turn and bunch-by-bunch measurement of the CSR intensity
Fast pulse sampling board (basic concept)
- Signal path: analog pulse (from LNA) -> wideband 1:4 power splitter -> four sampling channels (samples S1 to S4), each with a track-and-hold (T/H) amplifier and a 12-bit ADC at 500 MS/s -> FPGA -> DMA/PCIe
- Each channel is clocked through a picosecond delay chip (delay d), giving the sampling times T_S0, T_S1, T_S2, T_S3 along the pulse
- Clock path: ANKA clock -> jitter-cleaning PLL (RMS jitter = 400 fs) -> low-RMS-jitter fanout buffer -> picosecond delay chips
- Streaming data rate: 24 Gb/s (12 bit * 4 samples * 500 MHz)
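As a rough illustration of the concept, the sketch below reproduces the streaming rate quoted on the slide and lists the sampling instants of the four channels within one pulse. The 10 ps channel-to-channel delay is a hypothetical value chosen for the example, not a setting taken from the slides, and the function name is illustrative.

```python
ADC_BITS = 12                 # from the slide
CHANNELS = 4                  # from the slide
PULSE_RATE_HZ = 500_000_000   # one pulse per RF bucket, from the slide

# Every pulse produces one 12-bit word per channel.
stream_rate_bps = ADC_BITS * CHANNELS * PULSE_RATE_HZ


def sampling_times_ps(base_delay_ps, channel_step_ps, channels=CHANNELS):
    """Sampling instants of the channels relative to the clock edge.

    Assumption: each delay chip is programmed with an equal extra step
    relative to the previous channel.
    """
    return [base_delay_ps + k * channel_step_ps for k in range(channels)]


print(f"streaming rate: {stream_rate_bps / 1e9:.0f} Gb/s")
# Hypothetical 10 ps spacing between the four samples.
print("sampling times (ps):", sampling_times_ps(0, 10))
```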
Fast pulse sampling board (basic concept)
- Block diagram: analog signal -> wideband 1:4 power splitter -> four sampling channels (T/H and 12-bit ADC at 500 MS/s) -> FPGA -> DMA/PCIe; ANKA clock -> jitter-cleaning PLL -> low-RMS-jitter fanout buffer -> picosecond delay chips (sampling times T_S0 to T_S3)
- Key design elements:
  - 1:4 power splitter, DC-50 GHz
  - Wideband transmission lines (PCB)
  - Picosecond timing control (components and PCB design)
  - Very low-noise PCB design
  - High data throughput readout system: 24 Gb/s streaming (12 bit * 4 samples * 500 MHz)
Power splitter DC-50 GHz, PCB layout
- Power splitter (under test) mounted on a carrier board
- Layout labels: one input, four outputs, line sections with Zo = 50 Ω and Zo = 100 Ω, dimension 1.4 cm
- DC-65 GHz Rosenberger connectors
- Substrate: Duroid 5880
- Simulation plots: direct transfer (S21), marker at 6.36 dB; reflection (S11); isolation between outputs 2 and 4
Fast sampling prototype board
- Acquires one sample in the peaking-time region of each THz pulse (i.e. each CSR bunch emission)
- Single sampling channel: analog signal from LNA -> track-and-hold amplifier (T/H) -> fast ADC (500 MS/s) -> FPGA; ANKA RF clock (500 MHz) -> picosecond delay chip
- PCB: Rogers 4003 substrate and high-speed CPW transmission lines (BW: 50 GHz)
- Separation between analog and digital grounds
- Ad hoc RF filters on critical components
- Via fences and guard-ring layout techniques
- Components selected for low RMS time jitter
- RMS = ~2 mV
Sampling prototype board: ANKA beam test
- Tested with YBCO and NbN THz detectors
- Simultaneous turn-by-turn monitoring of all 184 buckets
- Continuous data stream (all bunches, all turns) without dead time
- Measurement of the CSR oscillation amplitude
- Recording and analysis of the time evolution of each bunch
- Figures: ANKA CSR with NbN HEB detector (multi-bunch filling pattern with trains 1 to 3 and a pilot bunch; 184 bunches, revolution time 368 ns, bunch spacing 2 ns); ANKA CSR over a long observation time with YBCO detector
- A.-S. Müller et al., MOPEA019, these proceedings
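Because one sample is taken per bucket and the stream has no dead time, turn-by-turn and bunch-by-bunch views follow from plain index arithmetic. The sketch below shows one way this could look; it assumes the first sample of the stream is aligned with bucket 0, and the function names are hypothetical rather than part of the actual DAQ software.

```python
HARMONIC_NUMBER = 184  # buckets per turn, from the slide


def split_into_turns(samples, buckets=HARMONIC_NUMBER):
    """Cut a continuous sample stream into complete turns.

    Assumption: samples[0] belongs to bucket 0; an incomplete last turn
    is dropped.
    """
    n_turns = len(samples) // buckets
    return [samples[t * buckets:(t + 1) * buckets] for t in range(n_turns)]


def bunch_history(turns, bucket):
    """Turn-by-turn evolution of the signal in a single bucket."""
    return [turn[bucket] for turn in turns]


# Synthetic stream for illustration: value = 1000 * turn + bucket.
stream = [1000 * turn + bucket
          for turn in range(3)
          for bucket in range(HARMONIC_NUMBER)]

turns = split_into_turns(stream)
print(bunch_history(turns, bucket=5))
```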
Four sampling channels board
- The 4-channel sampling board has recently been produced and electrically tested
- Board labels: PCIe connection, fast ADC (500 MS/s), ps delay chip, ANKA revolution clock, RF filters, track-and-hold, CPW transmission line, PLL and clock fanout, ANKA RF clock, sampling channel
- PCB made of Rogers 4003 with a 10-metal-layer stack-up
- Guard rings and via fences to reduce crosstalk and EMI
Conclusion and what's next
- Fast sampling prototype board: dynamic range of ±800 mV with RMS = 2 mV
- Very low deterministic jitter: sampling time accuracy of 3 ps
- Random jitter estimated to be < 500 fs
- High data throughput readout board based on PCIe-DMA (16 Gb/s) already available and used for beam studies
- Four sampling channels board developed and produced
- First beam test foreseen in the summer
- High data throughput readout board based on PCIe-DMA (32 Gb/s) under development
Thank you for your attention.
High-band CPW transmission line
- Analog signal, DC-50 GHz
- Loss = 38 dB/m
- Z0 = 50.7 Ω
Differential CPW transmission line
- Digital signal, T/H clock distribution, f = 500 MHz
Differential stripline (TL)
- Digital signal, ADC clock distribution, f = 500 MHz
Time and voltage jitter in the high-speed sampling board
- Jitter: the deviation from the ideal timing of an event
- Jitter is composed of both deterministic and Gaussian (random) content:
  - Deterministic jitter (DJ): jitter with a non-Gaussian probability density function, caused by crosstalk, EMI radiation, a noisy reference plane, simultaneous switching outputs (SSO), etc.
  - Random (Gaussian) jitter (RJ): like all physical phenomena, some level of randomness in the edge deviation occurs in all electronic signals
- Figure (jitter histogram) labels: reference point, mean, peak-to-peak, standard deviation (1σ)
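The histogram quantities named on this slide (mean, peak-to-peak, standard deviation) can be illustrated with synthetic data. The sketch below is not the authors' measurement procedure: it generates edge times from an assumed two-level deterministic offset plus Gaussian random jitter, with hypothetical magnitudes, and computes the three statistics. It does not attempt to separate DJ from RJ.

```python
import random
import statistics


def jitter_stats(edge_times_ps):
    """Mean, standard deviation (1 sigma) and peak-to-peak of edge times."""
    mean = statistics.fmean(edge_times_ps)
    sigma = statistics.pstdev(edge_times_ps)
    peak_to_peak = max(edge_times_ps) - min(edge_times_ps)
    return mean, sigma, peak_to_peak


# Hypothetical magnitudes, chosen only for illustration.
DJ_HALF_SPAN_PS = 1.5   # deterministic part: edge shifted by +/- 1.5 ps
RJ_SIGMA_PS = 0.5       # random part: Gaussian with 0.5 ps standard deviation

rng = random.Random(1)  # fixed seed so the example is repeatable
edges = [rng.choice((-DJ_HALF_SPAN_PS, DJ_HALF_SPAN_PS))
         + rng.gauss(0.0, RJ_SIGMA_PS)
         for _ in range(20000)]

mean_ps, sigma_ps, pp_ps = jitter_stats(edges)
print(f"mean = {mean_ps:.3f} ps, sigma = {sigma_ps:.3f} ps, "
      f"peak-to-peak = {pp_ps:.3f} ps")
```

With these inputs the deterministic part bounds the core of the distribution while the Gaussian part makes the peak-to-peak value grow with the number of recorded edges, which is why the two contributions are usually quoted with different figures of merit.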
High-throughput readout system & FPGA architecture
- Data path: THz radiation -> THz detector & 60 GHz LNA -> fast pulse sampling daughter board (24 Gb/s) -> FPGA -> PCI Express (x8 lanes @ 5 Gb/s, optical/electrical) -> chipset root port -> PC DAQ (CPU, memory)
- FPGA internal architecture:
  - SerDes input stage (KIT_ipcore): 48 lanes @ 500 Mbit/s
  - FIFOs, user bank register, FSM master control
  - On-line parallel data processing
  - DDR interface (KIT_ipcore): multi-port high-speed DDR3 interface @ 51 Gb/s, DDR3 memory (800 MHz @ 64 bit)
  - PCI Express + DMA (KIT_ipcore): PCIe bus-master DMA readout architecture operating with 8 lanes PCIe Gen2 (250 MHz)
  - Detector control and remote detector control
- Software: PCI Express/DMA Linux 32/64-bit driver; integration in the parallel GPU framework
- Virtex-6 floorplan blocks: DDR3 interface, PCIe/DMA, SerDes & input stage
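The link figures on this slide can be cross-checked against the 24 Gb/s produced by the sampling board. The sketch below is a back-of-the-envelope budget, not a statement about the actual firmware: it reads "800 MHz @ 64 bit" as 800 million 64-bit transfers per second, and it accounts only for the 8b/10b line coding of PCIe Gen2, ignoring packet and protocol overhead.

```python
FRONT_END_BPS = 24_000_000_000        # sampling board output, from the slide

# SerDes input stage: 48 lanes at 500 Mbit/s each (from the slide).
serdes_bps = 48 * 500_000_000

# DDR3 interface. Assumption: "800 MHz @ 64 bit" means 800e6 transfers/s
# of 64 bits each, which reproduces the 51 Gb/s quoted on the slide.
ddr3_bps = 800_000_000 * 64

# PCIe Gen2 x8: 5 Gb/s raw per lane, 8b/10b coding leaves 80 % for data.
pcie_raw_bps = 8 * 5_000_000_000
pcie_payload_bps = pcie_raw_bps * 8 // 10

for name, rate in [("front end", FRONT_END_BPS),
                   ("SerDes input", serdes_bps),
                   ("DDR3 interface", ddr3_bps),
                   ("PCIe Gen2 x8 (after 8b/10b)", pcie_payload_bps)]:
    print(f"{name:30s} {rate / 1e9:5.1f} Gb/s")
```

Under these assumptions the SerDes stage matches the front-end rate exactly, the DDR3 interface can absorb simultaneous writing and reading of the stream, and an x8 Gen2 link offers an upper bound of 32 Gb/s before protocol overhead.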
PCIe bus-master DMA readout architecture
- Layers: GPU/CPU applications; data decoding and consistency check; DMA driver (software layers); optical/electrical x8 lanes @ 5 Gb/s; GTX transceivers x8 @ 5 Gb/s; Xilinx integrated block for PCI Express Gen2 (Xilinx IP core); custom PCIe DMA interface
- Custom PCIe DMA interface: 4 DMA engines, user bank register, FPGA temperature and voltage control, IN port and OUT port
  - Each port: I/O interface logic, FIFO (RD or WR), packet control FSMs
  - Read-side signals: Data out [0..127], Data valid, Busy_logic, Clock_out
  - Write-side signals: Data in [0..127], WR_EN, Back-pressure, Clock_in
- Bus-master DMA operating with 8 lanes PCIe Gen2 (250 MHz)
- Two individual engines for write/read from the FPGA (user logic) to the PC central memory
- IN and OUT FIFO-like interfaces (for user logic)
- FIFOs decouple the clock domains of the DMA and the custom user logic
- Plot: comparison (NW-DMA vs. KIT-DMA), data transfer (Mbit/s) versus packet size in bytes; curves: PC Mem --> FPGA (NW), FPGA --> PC Mem (NW), PC Mem --> FPGA (Michele), FPGA --> PC Mem (Michele); data valid for X58 PCIe chipset
Conclusion & what's next
- 4-channel fast pulse shape sampling board is completed
- Board layout labels: high-density Samtec connectors, CPW transmission line, differential CPW transmission line, differential stripline transmission line, 10-metal-layer stack-up, analog & RF GND, digital & RF GND, analog P/S, digital P/S, analog input, shield, control signals
- First board available mid-February
- Beam test planned for summer 2013
- Commissioning for the experimental station: 2013
Fast sampling prototype board: time characterization
- Setup: waveform generator (pulse of 200 ps at 500 MHz) -> pulse sampling prototype board -> high-throughput FPGA readout board -> PCIe-DMA (16 Gb/s) -> DAQ/PC; probing with an oscilloscope (BW: 2.5 GHz, 20 GS/s)
- The falling edge of the pulse was acquired by the equivalent-time sampling method (both oscilloscope and sampling prototype board): each time the pulse occurs, the delay chip adds its minimum settable delay, so the sampling time moves by the minimum time step
- The pulse profile was obtained from 32 samples, one for each delay chip setting
- The pulse fall time was recorded with 30 samples inside 90 ps, i.e. a time accuracy of ~3 ps
- High linearity between the sampling time (set by the FPGA) and the real sampling time
- Very low deterministic jitter on the board
- Plot: pulse shape vs. samples; amplitude (mV) versus time (in units of 10^-10 s); pulse measured with the oscilloscope (bandwidth 2.5 GHz, 20 GS/s) and samples from the sampling prototype board, with the 90 ps span marked
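The equivalent-time sampling method described on this slide can be sketched as follows: one sample is taken per pulse occurrence, and the delay is advanced by one step between occurrences, so that a repetitive pulse is scanned with picosecond resolution. The 3 ps step and the 32 settings follow the slide; the Gaussian test pulse (450 mV peak, 40 ps sigma) and all function names are hypothetical stand-ins for the waveform generator signal.

```python
import math

PERIOD_PS = 2000.0      # 500 MHz pulse repetition, from the slide
DELAY_STEP_PS = 3.0     # 30 samples inside 90 ps, from the slide
N_SETTINGS = 32         # one sample per delay chip setting, from the slide


def repetitive_pulse_mv(t_ps):
    """Hypothetical test signal: a Gaussian pulse repeated every period."""
    phase = (t_ps + PERIOD_PS / 2) % PERIOD_PS - PERIOD_PS / 2
    return 450.0 * math.exp(-0.5 * (phase / 40.0) ** 2)


def equivalent_time_scan(signal, n_settings=N_SETTINGS,
                         step_ps=DELAY_STEP_PS, period_ps=PERIOD_PS):
    """Sample pulse number k at delay k * step; return (delay, value) pairs."""
    profile = []
    for k in range(n_settings):
        real_time_ps = k * period_ps + k * step_ps  # k-th pulse, k-th delay
        profile.append((k * step_ps, signal(real_time_ps)))
    return profile


profile = equivalent_time_scan(repetitive_pulse_mv)
for delay_ps, amplitude_mv in profile[:4]:
    print(f"delay {delay_ps:5.1f} ps -> {amplitude_mv:6.1f} mV")
```

Although consecutive samples are 2 ns apart in real time, the reconstructed profile has an effective step equal to the delay increment, which is the reason the time accuracy is set by the delay chip rather than by the ADC rate.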
Picosecond time jitter estimation between bunches
- Timing diagram: F_RF_ANKA; output of the main amplifier for a jitter of -20 ps and +20 ps; particle bunch synchronous to the ANKA RF clock; clean-jitter clock from the PLL; output samples S1 to S4; axes: amplitude (V) versus time (ps)
- The precise PLL clock is used as the trigger for the samples
- Procedure:
  - Fast reconstruction of the analog pulse from the 4 samples (FPGA or GPU)
  - Measurement of the peak pulse amplitude
  - Measurement of the time jitter between bunches from the position of the samples in the reconstructed pulse
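The slides do not state which reconstruction algorithm is used. As one possible illustration of the procedure, the sketch below assumes a Gaussian pulse shape and equally spaced samples, and recovers the peak amplitude and arrival time by fitting a parabola to the logarithm of three of the four samples. The pulse parameters, the 10 ps sample spacing, and the function names are all hypothetical.

```python
import math


def gaussian_peak_from_samples(times_ps, samples):
    """Estimate (peak amplitude, peak time) from equally spaced samples.

    Assumptions: Gaussian pulse shape, all samples positive, equal spacing.
    The log of a Gaussian is a parabola, so a three-point parabolic fit
    around the largest interior sample is exact for noise-free data.
    """
    i = max(range(1, len(samples) - 1), key=lambda k: samples[k])
    y0, y1, y2 = (math.log(samples[k]) for k in (i - 1, i, i + 1))
    step_ps = times_ps[i] - times_ps[i - 1]
    offset = 0.5 * (y0 - y2) / (y0 - 2.0 * y1 + y2)   # vertex, in steps
    peak_time_ps = times_ps[i] + offset * step_ps
    peak_amplitude = math.exp(y1 - 0.25 * (y0 - y2) * offset)
    return peak_amplitude, peak_time_ps


def sample_pulse(times_ps, amplitude, arrival_ps, sigma_ps=15.0):
    """Hypothetical Gaussian pulse sampled at the given instants."""
    return [amplitude * math.exp(-0.5 * ((t - arrival_ps) / sigma_ps) ** 2)
            for t in times_ps]


times = [-15.0, -5.0, 5.0, 15.0]   # hypothetical S1..S4 sampling instants

amp_a, t_a = gaussian_peak_from_samples(times, sample_pulse(times, 0.8, 7.0))
amp_b, t_b = gaussian_peak_from_samples(times, sample_pulse(times, 0.6, -4.0))

print(f"bunch A: {amp_a:.3f} V at {t_a:.2f} ps")
print(f"bunch B: {amp_b:.3f} V at {t_b:.2f} ps")
print(f"arrival-time difference: {t_a - t_b:.2f} ps")
```

With real data the samples carry noise and the pulse is not exactly Gaussian, so a fit against a measured pulse template would likely be more appropriate; the sketch only shows how four fixed sampling instants can yield both an amplitude and a bunch-to-bunch timing shift.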