CERN DS-Proto0 read-only backup, Page 7 of 8
ID Date Author Type Category Subject
  28   10 Apr 2019 05:40   Thomas   Routine   General   General work - day 3 at CERN
Several points:

1) Gave a series of tutorials on the DAQ to DS people yesterday and today.  Got a bunch of feedback, which I will pass on when I'm back at TRIUMF.

2) The computer ds-proto-daq was offline when I got in to the lab this morning.  Not clear what is wrong with the computer.  It didn't happen the
first day.  Maybe another power blip?  Maybe we need a UPS for this DAQ machine, to protect it from power blips.

3) Using instructions from Luke, reconfigured the CDM to use the clock from the chronobox. 

4) Added scripts for putting the chronobox and the CDM into a sensible state.  Scripts are 

/home/dsproto/online/dsproto_daq/setup_chronobox.sh
/home/dsproto/online/dsproto_daq/setup_cdm.sh

The scripts need to be rerun whenever the chronobox or the VME crate is power cycled.

5) Fixed some bugs and added some new plots to online monitoring.  In particular, added a bunch of plots related to the chronobox data.

6) Found some problems with monitoring of chronobox trigger primitives, which I passed on to Bryerton.
  27   08 Apr 2019 08:31   Thomas   Routine   General   General work - day 1 at CERN
Notes on day:

1) Fixed the problem with the network interfaces.  Now the computer boots with the correct network configuration; outside world visible 
and private network on.

2) The fan tray on the VME crate seemed to be broken.  Got another VME crate from the pool and installed it.  This VME crate seems to be working
well.

3) Recommissioned the DAQ setup.  Found a couple of small bugs related to the V1725 self-trigger logic.  Fixed those and the V1725
self-triggers seem to be working correctly.

4) Tried to install new CDM from TRIUMF (with ssh access), but clocks didn't stay synchronized.  Will bring module back to TRIUMF.

5) Added some code to V1725 frontend for clearing out the ZMQ buffers of extra events at the end of the run.  This is to protect 
against the case where the chronobox is triggering too fast for the V1725s.
  26   03 Apr 2019 15:31   Thomas   Routine   Software   test of elog
The last elog didn't go out cleanly.  Modified the elogd.cfg to point to the proxy.  Try again.
  25   03 Apr 2019 15:11   Thomas   Routine   Software   CERN SSO proxy for ds-proto-daq
Pierre and I got the CERN proxy set up for the Darkside prototype.

Using your CERN single-sign-on identity, you should be able to log in to this page

https://m-darkside.web.cern.ch/

and see our normal MIDAS webpage.

The CERN server is proxying the port 80 on ds-proto-daq.  You can also see all the other services through the
same page:

elog:
https://m-darkside.web.cern.ch/elog/DS+Prototype/

chronobox:
https://m-darkside.web.cern.ch/chronobox/

js-root:
https://m-darkside.web.cern.ch/rootana/


_________________________________

Technical details

1) We followed these instructions for creating a SSO-proxy:

https://cern.service-now.com/service-portal/article.do?n=KB0005442

We pointed the proxy to port 80 on ds-proto-daq

2) On ds-proto-daq, we needed to poke a hole through the firewall for port 80:


firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="188.184.28.139/32" port port="80" protocol="tcp" accept'
firewall-cmd --reload

[root@ds-proto-daq ~]# firewall-cmd --list-all
public (active)
...
	rule family="ipv4" source address="188.184.28.139/32" port port="80" protocol="tcp" accept

This firewall rule is pointing to some particular IP that seems to be the proxy side of the server:

[root@ds-proto-daq ~]# host 188.184.28.139
139.28.184.188.in-addr.arpa domain name pointer oostandardprod-7b34bdf1f3.cern.ch.

It is not clear if this particular IP will be stable in the long term.

3) We needed to modify mhttpd so it would serve content to hosts other than localhost.  So changed mhttpd
command from 

mhttpd -a localhost -D

to 

mhttpd -D
  24   28 Mar 2019 02:18   Pierre   Configuration   Trigger   Test
  23   27 Mar 2019 14:04   Pierre   Configuration   Trigger   Time stamp sync
The latest ChronoBox FW is loaded. Let me know if this is what the chronobox should look like in terms of registers.

Are we monitoring the PLL Lock Loss  (odb: /DEAP Alarm ?)

Here is the dump of the first 5 sync triggers without any physics trigger behind.

1 0x830b577 0x17d7855 0x6b33d22 0x6b33d22 8.992791 
2 0x830b577 0x17d7855 0x6b33d22 0x6b33d22 8.992791 
3 0x830b577 0x17d7855 0x6b33d22 0x6b33d22 8.992791 
4 0x830b577 0x17d7855 0x6b33d22 0x6b33d22 8.992791 

1 0x830b581 0x2faf097 0x535c4ea 0x535c4ea 6.992792 
2 0x830b581 0x2faf097 0x535c4ea 0x535c4ea 6.992792 
3 0x830b581 0x2faf095 0x535c4ec 0x535c4ec 6.992792 
4 0x830b581 0x2faf095 0x535c4ec 0x535c4ec 6.992792 

1 0x830b587 0x47868d7 0x3b84cb0 0x3b84cb0 4.992792 
2 0x830b587 0x47868d7 0x3b84cb0 0x3b84cb0 4.992792 
3 0x830b587 0x47868d7 0x3b84cb0 0x3b84cb0 4.992792 
4 0x830b587 0x47868d7 0x3b84cb0 0x3b84cb0 4.992792 

1 0x830b5a1 0x5f5e119 0x23ad488 0x23ad488 2.992794 
2 0x830b5a1 0x5f5e119 0x23ad488 0x23ad488 2.992794 
3 0x830b5a1 0x5f5e117 0x23ad48a 0x23ad48a 2.992794 
4 0x830b5a1 0x5f5e117 0x23ad48a 0x23ad48a 2.992794 

1 0x830b5ab 0x7735959 0xbd5c52 0xbd5c52 0.992795 
2 0x830b5ab 0x7735959 0xbd5c52 0xbd5c52 0.992795 
3 0x830b5ab 0x7735959 0xbd5c52 0xbd5c52 0.992795 
4 0x830b5ab 0x7735959 0xbd5c52 0xbd5c52 0.992795 


With the physics triggers:
1 0x830b5c1 0x1834497f 0xeffc6c42 0x100393be 21.493591 
2 0x830b5c1 0x1834497f 0xeffc6c42 0x100393be 21.493591 
3 0x830b5c1 0x1834497f 0xeffc6c42 0x100393be 21.493591 
4 0x830b5c1 0x1834497f 0xeffc6c42 0x100393be 21.493591 
1 0x830b5d3 0x1834abb5 0xeffc0a1e 0x1003f5e2 21.495601 
2 0x830b5d3 0x1834abb5 0xeffc0a1e 0x1003f5e2 21.495601 
3 0x830b5d3 0x1834abb5 0xeffc0a1e 0x1003f5e2 21.495601 
4 0x830b5d3 0x1834abb5 0xeffc0a1e 0x1003f5e2 21.495601 
1 0x830b5dd 0x18350d7f 0xeffba85e 0x100457a2 21.497603 
2 0x830b5dd 0x18350d7f 0xeffba85e 0x100457a2 21.497603 
3 0x830b5dd 0x18350d7f 0xeffba85e 0x100457a2 21.497603 
4 0x830b5dd 0x18350d7f 0xeffba85e 0x100457a2 21.497603 
1 0x830b5e3 0x183585e7 0xeffb2ffc 0x1004d004 21.500068 
2 0x830b5e3 0x183585e7 0xeffb2ffc 0x1004d004 21.500068 
3 0x830b5e3 0x183585e7 0xeffb2ffc 0x1004d004 21.500068 
4 0x830b5e3 0x183585e7 0xeffb2ffc 0x1004d004 21.500068 
1 0x830b5ed 0x1a10bb9b 0xee1ffa52 0x11e005ae 23.991535 
2 0x830b5ed 0x1a10bb9b 0xee1ffa52 0x11e005ae 23.991535 
3 0x830b5ed 0x1a10bb9b 0xee1ffa52 0x11e005ae 23.991535 
4 0x830b5ed 0x1a10bb9b 0xee1ffa52 0x11e005ae 23.991535 
1 0x830b5fb 0x1a111f65 0xee1f9696 0x11e0696a 23.993578 
2 0x830b5fb 0x1a111f65 0xee1f9696 0x11e0696a 23.993578 
3 0x830b5fb 0x1a111f65 0xee1f9696 0x11e0696a 23.993578 
4 0x830b5fb 0x1a111f65 0xee1f9696 0x11e0696a 23.993578 
1 0x830b607 0x1a118115 0xee1f34f2 0x11e0cb0e 23.995577 
2 0x830b607 0x1a118115 0xee1f34f2 0x11e0cb0e 23.995577 
3 0x830b607 0x1a118115 0xee1f34f2 0x11e0cb0e 23.995577 
4 0x830b607 0x1a118115 0xee1f34f2 0x11e0cb0e 23.995577 
1 0x830b611 0x1a11e2c7 0xee1ed34a 0x11e12cb6 23.997577 
2 0x830b611 0x1a11e2c7 0xee1ed34a 0x11e12cb6 23.997577 
3 0x830b611 0x1a11e2c7 0xee1ed34a 0x11e12cb6 23.997577 
4 0x830b611 0x1a11e2c7 0xee1ed34a 0x11e12cb6 23.997577 


The ZMQ0 banks:
#banks:5 Bank list:-ZMQ0W200W201W202W203-
Bank:ZMQ0 Length: 40(I*1)/10(I*4)/10(Type) Type:Unsigned Integer*4
   1-> 0x000a5f1c 0x000000c4 0x00000001 0x0ebd5273 0x00000001 0x00010001 0xffffffff 0x00000000 
   9-> 0xffff0000 0x00000000 
------------------------ Event# 10 ------------------------
#banks:5 Bank list:-ZMQ0W200W201W202W203-
Bank:ZMQ0 Length: 40(I*1)/10(I*4)/10(Type) Type:Unsigned Integer*4
   1-> 0x000a5f1d 0x000000c5 0x00000001 0x0ebd5279 0x00000001 0x00010001 0xffffffff 0x00000000 
   9-> 0xffff0000 0x00000000 
[dsproto@ds-proto-daq dsproto_daq]$ 
  22   27 Mar 2019 08:57   Pierre   Configuration   Hardware   CERN setup
Found that the Trigger out from the CB is on output1
Trigger / Not used
Clock   / Not used

Set the frontend to use NIM in/out instead of TTL, as there is a nice
CAEN NIM-TTL-NIM adapter module available
  21   11 Mar 2019 15:27   Thomas   Routine   Trigger   New chronobox firmware; run start/stop implemented
> f) However, I found that the frontend program still consistently failed with this error when the trigger rate
> was above the maximum sustainable:
> 
> Deferred transition.  First call of wait_buffer_empty. Stopping run
> [feov1725MTI00,ERROR] [v1725CONET2.cxx:685:ReadEvent,ERROR] Communication error: -2
> [feov1725MTI00,ERROR] [feoV1725.cxx:654:link_thread,ERROR] Readout routine error on thread 0 (module 0)
> [feov1725MTI00,ERROR] [feoV1725.cxx:655:link_thread,ERROR] Exiting thread 0 with error
> Stopped chronobox run; status = 0
> Segmentation fault

I sort of 'fixed' this problem.  There is some sort of conflict between the V1725 readout thread and the system call
to esper-tool to stop the run.  Some collision between the system resources for these two calls causes the readout
thread ReadEvent call to fail.  I 'fixed' the problem by adding, in the end_of_run part, a 500us pause of the readout
threads before I make the system call to esper-tool.

Odd.  In principle I think that the system calls and the readout threads are running on different cores.  So it is not
clear what the problem was.  Should figure out a better diagnosis and fix the problem properly.
  20   11 Mar 2019 12:30   Thomas   Routine   Trigger   New chronobox firmware; run start/stop implemented
a) Bryerton implemented a new version of the firmware.  New features:

1) run start/stop state
2) at run start 6 events are sent with 200ms separation
3) there is a larger set of trigger counters, as well as settings for configuring the trigger.

b) chronobox webpage can be seen here:

https://ds-proto-daq.triumf.ca/chronobox/

The mod_tdm page gives configuration of run state and trigger.  In particular

- button to start/stop run
- button to do manual trigger
- configure which channels are TOP and which are BOTTOM
- configure the number of TOP or BOTTOM channels that need to fire
- DecisionType: true= TOP AND BOTTOM groups must fire; false = TOP OR BOTTOM groups can fire.
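The DecisionType logic described above can be sketched as follows. This is only an illustration of the AND/OR choice, not the firmware implementation; the function and parameter names are hypothetical.

```python
def trigger_decision(top_fired, bottom_fired, n_top_required,
                     n_bottom_required, decision_type):
    """Illustrative model of the mod_tdm trigger decision.

    top_fired / bottom_fired: lists of booleans, one per channel assigned
    to the TOP or BOTTOM group (hypothetical representation).
    decision_type: True  -> TOP AND BOTTOM groups must both fire
                   False -> TOP OR BOTTOM group firing is enough
    """
    # A group "fires" when enough of its channels are above threshold.
    top_ok = sum(top_fired) >= n_top_required
    bottom_ok = sum(bottom_fired) >= n_bottom_required
    if decision_type:
        return top_ok and bottom_ok
    return top_ok or bottom_ok
```

For example, with one channel required per group, a hit in TOP alone triggers only when DecisionType is false.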

c) Start and stop run can also be done on command line with esper tool:

esper-tool write -d true 192.168.1.3 mod_tdm run
esper-tool write -d false 192.168.1.3 mod_tdm run
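A minimal sketch of wrapping these two commands from Python, assuming esper-tool is on the PATH and the chronobox is at 192.168.1.3 as above. The helper names and the dry_run option are hypothetical and only there for illustration.

```python
import subprocess

CHRONOBOX_IP = "192.168.1.3"  # address used in the commands above


def esper_run_command(start, host=CHRONOBOX_IP):
    """Build the argument list for the run start/stop command shown above."""
    value = "true" if start else "false"
    return ["esper-tool", "write", "-d", value, host, "mod_tdm", "run"]


def set_chronobox_run(start, host=CHRONOBOX_IP, dry_run=False):
    """Start (True) or stop (False) the chronobox run; returns the exit status."""
    cmd = esper_run_command(start, host)
    if dry_run:
        # Do not touch the hardware; useful for checking the command only.
        return 0
    return subprocess.run(cmd, check=False).returncode
```

Passing the command as a list avoids going through a shell, so no quoting is needed.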

d) I modified the V1725 frontend to integrate the chronobox start/stop.

At run start
- configure V1725s
- start chronobox with the command-line esper-tool call

At end run
- add deferred transition function which stops run (with esper-tool), then checks whether all the events have
cleared from ring buffers.
- once ring buffers are cleared, finish stopping the V1725s.

In the long run should somehow do the start/stop commands with some http post command, rather than command line
to esper-tool.
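The end-of-run ordering described above can be sketched like this. It is a simplified, blocking illustration: the real frontend uses a MIDAS deferred transition that is called repeatedly rather than a loop, and the three callbacks and the timeout are hypothetical stand-ins.

```python
import time


def deferred_stop(stop_chronobox, buffers_empty, stop_digitizers,
                  poll_s=0.01, timeout_s=5.0):
    """Stop the trigger source first, drain the buffers, then stop the V1725s.

    stop_chronobox:  callable that stops trigger generation
    buffers_empty:   callable returning True once the ring buffers are cleared
    stop_digitizers: callable that finishes stopping the V1725s
    Returns True on a clean stop, False if the buffers never emptied.
    """
    stop_chronobox()
    deadline = time.monotonic() + timeout_s
    while not buffers_empty():
        if time.monotonic() > deadline:
            # Give up; the caller decides how to handle the stuck buffers.
            return False
        time.sleep(poll_s)
    stop_digitizers()
    return True
```

Stopping the chronobox before checking the buffers matters: otherwise new triggers keep arriving and the buffers may never drain.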

e) With the start/stop run could start to compare the timestamps of V1725s and confirm that they matched (ie,
all V1725s got reset at the same time).  Also, all the events are cleared from buffers.
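A minimal sketch of such a timestamp comparison, assuming each event provides one timestamp per V1725 in the same clock units. The function names, the tolerance parameter and the example values are illustrative, and counter wrap-around is not handled.

```python
def timestamps_match(timestamps, tolerance=0):
    """True if all board timestamps for one event agree within tolerance ticks."""
    return max(timestamps) - min(timestamps) <= tolerance


def find_mismatches(events, tolerance=0):
    """Return the indices of events whose board timestamps disagree."""
    return [i for i, ts in enumerate(events)
            if not timestamps_match(ts, tolerance)]


# Made-up example: four boards, two events; the second has a 2-tick spread.
events = [(100, 100, 100, 100), (200, 200, 202, 200)]
print(find_mismatches(events))               # strict comparison
print(find_mismatches(events, tolerance=2))  # allow a 2-tick spread
```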

f) However, I found that the frontend program still consistently failed with this error when the trigger rate
was above the maximum sustainable:

Deferred transition.  First call of wait_buffer_empty. Stopping run
[feov1725MTI00,ERROR] [v1725CONET2.cxx:685:ReadEvent,ERROR] Communication error: -2
[feov1725MTI00,ERROR] [feoV1725.cxx:654:link_thread,ERROR] Readout routine error on thread 0 (module 0)
[feov1725MTI00,ERROR] [feoV1725.cxx:655:link_thread,ERROR] Exiting thread 0 with error
Stopped chronobox run; status = 0
Segmentation fault


Next steps:

1) Fix the seg-fault for high rate running.
2) More detailed timestamp checking, with cm_msg(ERRORs)
3) Bryerton is now working on the event FIFO on chronobox.  That will be next thing to integrate.
  19   06 Mar 2019 18:19   Bryerton Shaw   Routine   Trigger   Setup of chronobox
The SD card is currently required for operation of the device; the ext3/4 filesystem will immediately fail upon removal.


> Summary of setup of chronobox on network (mostly done by Pierre):
> 
> 1) Hook up USB connection from chronobox to ds-proto-daq.  Start serial-USB connection by doing
> 
> /home/dsproto/online/dsproto_daq/serialusb
> 
> can login through serial link.  Added the ssh key for dsproto@ds-proto-daq to the authorized_keys file on
> chronobox linux.
> 
> ctrl-a, ctrl-x to stop picocom
> 
> 2) On ds-proto-daq, setup dhcpd server.  Configuration in this file
> 
> /etc/dhcp/dhcpd.conf
> 
> dhcpd is bound to the second NIC only.  Configure DHCP to give IP 192.168.1.3 to chronobox.  In /etc/hosts set
> 192.168.1.3 to have hostname chronobox
> 
> Power cycle chronobox; it successfully gets IP:
> 
> [dsproto@ds-proto-daq dsproto_daq]$ ping chronobox
> PING chronobox (192.168.1.3) 56(84) bytes of data.
> 64 bytes from chronobox (192.168.1.3): icmp_seq=1 ttl=64 time=0.180 ms
> 
> 3) Can see the chronobox webpage locally as
> 
> http://chronobox:8080
> 
> 4) Some esper tool to access data:  did following to setup esper
> 
>   yum install python2-pip
>   pip install esper-tool
>   yum install ncurses
>   yum install ncurses-devel
>   pip install esper-tool
> 
> Can then connect to the chronobox by doing
> 
> esper-tool interactive http://chronobox:8080
> 
> This is as much as I understand at this point... more exploring now...
> 
> 5) Pierre unplugged the SD card to take it back to Bryerton's room.  But I guess this was bad.
> 
> lots of errors on serialUSB link now and the webpage doesn't work anymore:
> 
> [ 1246.665269] blk_partition_remap: fail for partition 2
> Jan  1 00:20:46 buildroot user.warn kernel: [ 1242.904770] EXT4-fs error: 339 callbacks suppressed
> [ 1246.679162] blk_partition_remap: fail for partition 2
> Jan  1 00:20:46 buildroot user.crit kernel: [ 1242.904780] EXT4-fs error (device mmcblk0p2):
> ext4_find_entry:1437: inode #2: com[ 1246.692884] blk_partition_remap: fail for partition 2
> m syslogd: reading directory lblock 0
> [ 1246.709020] blk_partition_remap: fail for partition 2
> Jan  1 00:20:46 buildroot user.warn kernel: [ 1242.929199] blk_partition_remap: fail for partition 2
> [ 1246.717394] blk_partition_remap: fail for partition 2
> Jan  1 00:20:46 buildroot user.crit kernel: [ 1242.942971] EXT4-fs error (device mmcblk0p2):
> ext4_find_entry:1437: inode #2: com[ 1246.731255] blk_partition_remap: fail for partition 2
> m syslogd: reading directory lblock 0
> Jan  1 00:20:46 buildroot user.warn kernel: [ 1242.953760] blk_partition_remap: fail for [ 1246.747337]
> blk_partition_remap: fail for partition 2
> partition 2
> Jan  1 00:20:46 buildroot user.crit kernel: [ 1242.967532] EXT4-fs error (device mmcblk0p2):
> ext4_find_entry:1437: [ 1246.763481] blk_partition_remap: fail for partition 2
> inode #2: comm syslogd: reading directory lblock 0
> Jan  1 00:20:46 buildroot user.warn kernel: [ 1242.978320] blk_partition_rem[ 1246.779511] blk_partition_remap:
> fail for partition 2
> ap: fail for partition 2
> Jan  1 00:20:46 buildroot user.crit kernel: [ 1242.992091] EXT4-fs error (device mmcblk0p2): ext4_find[
> 1246.795586] blk_partition_remap: fail for partition 2
> _entry:1437: inode #2: comm syslogd: reading directory lblock 0
> Jan  1 00:20:46 buildroot user.warn kernel: [ 1243.002885] blk_[ 1246.811671] blk_partition_remap: fail for
> partition 2
> partition_remap: fail for partition 2
> Jan  1 00:20:46 buildroot user.crit kernel: [ 1243.016656] EXT4-fs error (device mmcblk0p[ 1246.827750]
> blk_partition_remap: fail for partition 2
  18   06 Mar 2019 15:10   Thomas, Pierre   Routine   Trigger   Setup of chronobox
Summary of setup of chronobox on network (mostly done by Pierre):

1) Hook up USB connection from chronobox to ds-proto-daq.  Start serial-USB connection by doing

/home/dsproto/online/dsproto_daq/serialusb

Can then log in through the serial link.  Added the ssh key for dsproto@ds-proto-daq to the authorized_keys file on
the chronobox Linux.

ctrl-a, ctrl-x to stop picocom

2) On ds-proto-daq, set up a dhcpd server.  Configuration is in this file

/etc/dhcp/dhcpd.conf

dhcpd is bound to the second NIC only.  Configured DHCP to give IP 192.168.1.3 to the chronobox.  In /etc/hosts set
192.168.1.3 to have hostname chronobox

Power cycle chronobox; it successfully gets IP:

[dsproto@ds-proto-daq dsproto_daq]$ ping chronobox
PING chronobox (192.168.1.3) 56(84) bytes of data.
64 bytes from chronobox (192.168.1.3): icmp_seq=1 ttl=64 time=0.180 ms

3) Can see the chronobox webpage locally as

http://chronobox:8080

4) Installed esper-tool to access data.  Did the following to set up esper:

  yum install python2-pip
  pip install esper-tool
  yum install ncurses
  yum install ncurses-devel
  pip install esper-tool

Can then connect to the chronobox by doing

esper-tool interactive http://chronobox:8080

This is as much as I understand at this point... more exploring now...

5) Pierre unplugged the SD card to take it back to Bryerton's room.  But I guess this was bad.

Lots of errors on the serial-USB link now, and the webpage doesn't work anymore:

[ 1246.665269] blk_partition_remap: fail for partition 2
Jan  1 00:20:46 buildroot user.warn kernel: [ 1242.904770] EXT4-fs error: 339 callbacks suppressed
[ 1246.679162] blk_partition_remap: fail for partition 2
Jan  1 00:20:46 buildroot user.crit kernel: [ 1242.904780] EXT4-fs error (device mmcblk0p2):
ext4_find_entry:1437: inode #2: com[ 1246.692884] blk_partition_remap: fail for partition 2
m syslogd: reading directory lblock 0
[ 1246.709020] blk_partition_remap: fail for partition 2
Jan  1 00:20:46 buildroot user.warn kernel: [ 1242.929199] blk_partition_remap: fail for partition 2
[ 1246.717394] blk_partition_remap: fail for partition 2
Jan  1 00:20:46 buildroot user.crit kernel: [ 1242.942971] EXT4-fs error (device mmcblk0p2):
ext4_find_entry:1437: inode #2: com[ 1246.731255] blk_partition_remap: fail for partition 2
m syslogd: reading directory lblock 0
Jan  1 00:20:46 buildroot user.warn kernel: [ 1242.953760] blk_partition_remap: fail for [ 1246.747337]
blk_partition_remap: fail for partition 2
partition 2
Jan  1 00:20:46 buildroot user.crit kernel: [ 1242.967532] EXT4-fs error (device mmcblk0p2):
ext4_find_entry:1437: [ 1246.763481] blk_partition_remap: fail for partition 2
inode #2: comm syslogd: reading directory lblock 0
Jan  1 00:20:46 buildroot user.warn kernel: [ 1242.978320] blk_partition_rem[ 1246.779511] blk_partition_remap:
fail for partition 2
ap: fail for partition 2
Jan  1 00:20:46 buildroot user.crit kernel: [ 1242.992091] EXT4-fs error (device mmcblk0p2): ext4_find[
1246.795586] blk_partition_remap: fail for partition 2
_entry:1437: inode #2: comm syslogd: reading directory lblock 0
Jan  1 00:20:46 buildroot user.warn kernel: [ 1243.002885] blk_[ 1246.811671] blk_partition_remap: fail for
partition 2
partition_remap: fail for partition 2
Jan  1 00:20:46 buildroot user.crit kernel: [ 1243.016656] EXT4-fs error (device mmcblk0p[ 1246.827750]
blk_partition_remap: fail for partition 2
  17   06 Mar 2019 14:16   Thomas   Routine   General   Retested the chronobox trigger logic
I retested the chronobox trigger generation:

1) Inserting moderate sized pulse into channel 8 of V1725-0.
2) Configured threshold of V1725 so that channel triggers LVDS pulse into chronobox
3) Fan-out trigger out from chronobox to all V1725s.
4) busy signal from each V1725 fed into the chronobox
5) Start run, then push trigger from 20Hz up to 200Hz
6) System running stably!  Actual trigger rate about 60.2Hz.  The almost_full condition is set to 32 on the
V1725s and the estored on each board fluctuates below 32.

https://ds-proto-daq.triumf.ca/HS/Buffers/eStored?hscale=300&fgroup=Buffers&fpanel=eStored&scale=10m

The busy light on the V1725 never comes on, good.

The next thing we need is some way to start/stop the trigger generation on the chronobox, so that at
begin-of-run triggers do not get sent before the V1725s have finished configuring.

Other notes:

a) It turned out that the register to set the V1725 channel trigger threshold was different for RAW vs ZLE
firmware (0x1080 vs 0x1060); after fixing that the channel threshold seemed to work as expected.
b) Map for chronobox NIM cables:

channel 1-4: busy IN from V1725
channel 5,7,8: trigger OUT from chronobox
channel 6: clock OUT from chronobox
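Note a) above could be captured in a small lookup, sketched here. The base addresses are the two quoted in the note, read in the order given (RAW 0x1080, ZLE 0x1060). The 0x100 per-channel stride and the 16-channel limit are assumptions about the CAEN register layout, and the names are hypothetical.

```python
# Base address of the channel trigger threshold register for channel 0,
# as quoted in note a): RAW (standard waveform) vs ZLE firmware.
THRESHOLD_BASE = {"RAW": 0x1080, "ZLE": 0x1060}

CHANNEL_STRIDE = 0x100  # assumption: per-channel registers are 0x100 apart
N_CHANNELS = 16         # assumption: 16 channels per V1725


def threshold_register(firmware, channel):
    """Return the threshold register address for a firmware type and channel."""
    if not 0 <= channel < N_CHANNELS:
        raise ValueError("channel out of range: %d" % channel)
    return THRESHOLD_BASE[firmware] + channel * CHANNEL_STRIDE


print(hex(threshold_register("RAW", 8)))
```

Keeping the firmware type as an explicit key makes it harder to write the threshold to the wrong address after a firmware switch.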
  16   05 Mar 2019 14:20   Thomas   Routine   Hardware   Installed network card
I installed a PCIe 1Gbps network card and configured it as a private network.  The PC (ds-proto-daq) is
192.168.1.1.  I guess we can make the chronobox 192.168.1.2.

[root@ds-proto-daq ~]# ifconfig enp5s0
enp5s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.1  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::bb27:5db:f778:d584  prefixlen 64  scopeid 0x20<link>
        ether 68:05:ca:8e:66:5c  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 27  bytes 4145 (4.0 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 17  memory 0xf72c0000-f72e0000  
  15   05 Mar 2019 10:36   Thomas   Routine   Digitizer   Switched to standard V1725 firmware
It turns out that the ZLE V1725 firmware we are using only supports reading out up to 4000 samples per channel.
 We need 50000 samples (at 4 ns/sample) to read out 200us, which is the requirement.

So we switch the V1725s to use the backup firmware on the board, which is the standard waveform firmware. 
Firmware version is 17200410.

Expected data size for event with 200us of data: 200000 ns * (1 sample / 4 ns) * 16 ch * 4 boards * 2 bytes/sample =
 6.4MB per event

Measured max event rate of 60Hz with 385MB/s with 200us readout.
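The event-size estimate above can be reproduced with a few lines. This is a sketch using only the numbers quoted in this entry (4 ns per sample, 16 channels, 4 boards, 2 bytes per sample); it ignores event headers, and the function names are illustrative.

```python
NS_PER_SAMPLE = 4        # 4 ns per sample, as in the estimate above
BYTES_PER_SAMPLE = 2
CHANNELS_PER_BOARD = 16
N_BOARDS = 4


def samples_per_channel(window_ns):
    """Number of samples needed to cover a readout window."""
    return window_ns // NS_PER_SAMPLE


def event_size_bytes(window_ns):
    """Raw waveform payload for one event across all boards (no headers)."""
    return (samples_per_channel(window_ns) * BYTES_PER_SAMPLE
            * CHANNELS_PER_BOARD * N_BOARDS)


def data_rate_mb_s(window_ns, rate_hz):
    """Data rate in MB/s (1 MB = 1e6 bytes) at a given event rate."""
    return event_size_bytes(window_ns) * rate_hz / 1e6


window_ns = 200_000                       # 200 us readout window
print(samples_per_channel(window_ns))     # samples per channel
print(event_size_bytes(window_ns) / 1e6)  # MB per event
print(data_rate_mb_s(window_ns, 60))      # MB/s at 60 Hz
```

At 60 Hz this gives about 384 MB/s, in line with the measured 385MB/s.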

Needed to increase max event size, set buffer organization to 6 and set almost full to 32 in order to
accommodate the larger event size.

Changed some registers for different firmware:
- V1725_SELFTRIGGER_LOGIC

More tests needed
  14   19 Feb 2019 16:26   Pierre-A.   Configuration   General   Overall HW configuration

For Reference,

I put a simple schematic for the Trigger/Run control.

Bryerton, please have a look. Let's try to issue 3..5 SW trigger before opening the HW trigger.

 

Attachment 1: ds-proto-architecture-02.pdf
  13   13 Feb 2019 16:45   Pierre   Problem   Hardware   A3818 from Marco
Checking again Marco's A3818:
- Port #2 (third from top of card) is acting up.
- Changing the SFP makes no difference.
- Symptoms: gets stuck on Rx/Tx.
- Needs more investigation...
  12   08 Feb 2019 12:08   Thomas   Problem   Hardware   Installed Marco's A3818; didn't work
I installed Marco's A3818 PCIe card.  Didn't seem to work.  I got communication errors talking to link 2.  The
communication problems didn't happen right away, but happened once the run started.

I swapped the fibres going to port 2 and port 3 on the A3818.  The problem stayed with port 2.  So I conclude
that this A3818 module is no good.
  11   07 Feb 2019 17:30   Thomas   Routine   Software   Testing the maximum data throughput
First check the maximum trigger rate and maximum data rate for different sample lengths (for each channel):

sample length      Maximum rate   MB/s    CPU % (per thread)
16us               0.44kHz        231     20
6.4us              1.04kHz        214     20  
3.2us              1.95kHz        198     22   
1.6us              3.12kHz        160     26   
0.8us              4.88kHz        127     30   (what are these threads doing?)

Look at the code more.  See that there is a maximum size for the block transfer of 10kB.  Increase this to 130kB
(which is the maximum amount that this board can make per event).  Now find

sample length     Maximum rate       MB/s     CPU % (per thread)
16us              0.68kHz            346      7
6.4us             1.46kHz            300      10 
1.6us             3.6kHz             190      25

Good.  So for long samples we are actually slightly above the maximum transfer rate of 85MB/s*4 = 340MB/s
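As a cross-check of the second table, the sketch below works backwards from the measured numbers to the implied event size and the per-link throughput. It only uses values quoted in this entry (4 links, 85MB/s per link); the function names are illustrative.

```python
N_LINKS = 4           # four optical links, as in 85MB/s*4 above
LINK_LIMIT_MB_S = 85  # nominal per-link transfer rate quoted above


def implied_event_kb(rate_khz, mb_per_s):
    """Event size in kB implied by a measured rate and throughput.

    MB/s divided by kHz is (1e6 bytes/s) / (1e3 events/s) = 1e3 bytes/event.
    """
    return mb_per_s / rate_khz


def per_link_mb_s(mb_per_s):
    """Throughput carried by each link, assuming an even split."""
    return mb_per_s / N_LINKS


# (sample length in us, max rate in kHz, MB/s) from the second table
rows = [(16, 0.68, 346), (6.4, 1.46, 300), (1.6, 3.6, 190)]
for sample_us, rate_khz, mb_s in rows:
    total_kb = implied_event_kb(rate_khz, mb_s)
    print(sample_us, round(total_kb), round(total_kb / N_LINKS, 1),
          per_link_mb_s(mb_s) > LINK_LIMIT_MB_S)
```

For the 16us row this gives roughly 127 kB per board per event, consistent with the 130kB block-transfer size mentioned above.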

Tried writing out the data to disk at the maximum 345MB/s rate; the DAQ couldn't keep up.  Maximum rate was more
like 270MB/s.
I think the mlogger was actually fine, but the write-to-disk speed of the hard drive could not keep up.
So I think we are limited by hardware in that case.  We would need a large RAID array to be able to write faster.
  10   06 Feb 2019 14:12   Pierre   Configuration   Hardware   Extended Trigger Time Tag (ETTT)
Confirmed this ETTT configuration is working.

                     ETTT Enabled [22..21] = b10
                      |
                      v
Data [0x811C] = 0x 00 4 D 013C

                ETTT Time [47..32]
                 |  Ch Mask[16..0]  Time[31..0]
                 |    TTTTTT           |
Header 1         v    v    v           v
0xa0001914 0x00 0025 ff 0xff1ca598 0xe9c6e8a1   < event 1
0xa0001914 0x00 0025 ff 0xff1ca599 0xe9c82e25   < event 2

dTime :  0x25e9c82e25 - 0x25e9c6e8a1 = 0x14584 => 83332
Time interval: 8ns  => 666.7e-6s => Freq: 1500Hz corresponding to the current trigger rate
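The arithmetic above can be reproduced as follows. This sketch combines the 16-bit ETTT field with the 32-bit time tag into a 48-bit timestamp and converts the difference to a frequency using the 8ns tick. The function names are illustrative, and the header fields are taken as already extracted from the event words.

```python
TICK_NS = 8  # time tag unit used in the calculation above


def ettt(high16, low32):
    """Build the 48-bit extended trigger time tag from its two fields."""
    return ((high16 & 0xFFFF) << 32) | (low32 & 0xFFFFFFFF)


def trigger_frequency_hz(t1, t2, tick_ns=TICK_NS):
    """Frequency implied by two consecutive 48-bit time tags."""
    dt_ticks = (t2 - t1) % (1 << 48)  # tolerate one 48-bit wrap-around
    return 1e9 / (dt_ticks * tick_ns)


t1 = ettt(0x0025, 0xE9C6E8A1)  # event 1
t2 = ettt(0x0025, 0xE9C82E25)  # event 2
print(hex(t1), hex(t2), t2 - t1)
print(trigger_frequency_hz(t1, t2))
```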

PAA
  9   31 Jan 2019 15:18   Pierre   Configuration   Trigger   Trigger rate
Somehow the trigger rate was not matching the trigger source.
Found that Link 3 was not collecting and was possibly holding up the fragment assembly in the main thread.
Swapped Link3 <-> Link0 on the V1725 and restarted.
Needs further investigation!

Data rate is fine now! CPU load is balanced on all 4 threads (~25%)
- irqbalance disabled
- change affinity for A3818 to cpu9: /etc/rc.local add: echo 0200 > /proc/irq/136/smp_affinity
  Check : watch -n 0.1 'cat /proc/interrupts'
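The value written to smp_affinity is a hexadecimal CPU bitmask, with bit n selecting CPU n. A small sketch of how the mask for cpu9 comes out as 0200 (the helper name is hypothetical):

```python
def irq_affinity_mask(cpus):
    """Return the hex bitmask string for a list of CPU indices."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu  # bit n selects CPU n
    return format(mask, "x")


# cpu9 -> bit 9 -> 0x200, the same value as the 0200 written above
print(irq_affinity_mask([9]))
```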

Maximum Trigger rate (HW buffer not rising) 1950 Evt/s => 200MB/s 
for event size of 100KB composed of 4 banks with 32us per channel.