Robopanda Hacks and mods

sevik

Testing the meaning of the individual bits for band 2 (500 Hz). Bands 0 and 1 work the same way.

test_bits_for_2band.wav

test_bits_for_2band.aud

Generator code:

#test_bits_for_band2
if 1:
    test_frame_base = silence[:]
    i = 2

    # set volume for band
    bnum, shift = divmod(i,2)
    shift *= 4
    test_frame_base[bnum] |= 14 
milw

So, you must be nearly ready to write an encoder, don't you think? Really nice work, sevik!

sevik

Already started :)) But I need to do some real-life work :)) Will continue in the evening :))

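#!/usr/bin/env python3
# Print the FFT magnitudes of each 128-sample frame of a 16-bit PCM file.
# Uses numpy.fft in place of the old Numeric "FFT" module.

import struct
import sys

import numpy

def main(fn):
    # Read in binary mode. A file header, if present, is not skipped.
    with open(fn, 'rb') as f:
        rdata = f.read()
    while len(rdata) >= 256:
        # 128 samples per frame, assumed to be little-endian signed 16-bit PCM.
        data = struct.unpack("<128h", rdata[:256])

        # One row per frame: magnitude of each bin, scaled down by 1000.
        print("".join("%6d" % int(abs(d) / 1000) for d in numpy.fft.rfft(data, 128)))

        rdata = rdata[256:]

if __name__ == "__main__":
    main(sys.argv[1])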

sevik

Another version of test_bits_for_2band

I have done a frequency analysis that includes phase data (.fft file)

test_bits_for_2band_2.wav
test_bits_for_2band_2.aud
test_bits_for_2band_2.fft

sevik

Updated tools tarball with fft analyzer

Parameters for the analysis (in the fft.py code):

corr
Because the clock frequencies of the Robopanda and the sound card differ slightly, the analyzer tries to recover the right rate by periodically skipping or duplicating samples.

So corr is the number of samples to skip or duplicate for each frame.

It can be found for each wav with the tune() function, by looking for the value that gives a stable phase for the 500 Hz signal.
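
A minimal sketch of the skip/duplicate idea, not the actual fft.py code. The function name, the 128-sample frame length, and the sign convention (positive corr skips samples, negative corr re-reads them) are assumptions made for illustration.

```python
def split_frames(samples, frame_len=128, corr=0.0):
    """Cut samples into frames, correcting for clock drift.

    corr is the drift in samples per frame (may be fractional).
    Whole samples are skipped (corr > 0) or read twice (corr < 0)
    once the accumulated drift reaches one full sample.
    """
    frames = []
    pos = 0
    acc = 0.0
    while pos + frame_len <= len(samples):
        frames.append(samples[pos:pos + frame_len])
        acc += corr
        step = int(acc)      # whole samples of drift, truncated toward zero
        acc -= step          # keep the fractional remainder for later frames
        pos += frame_len + step
    return frames
```

With corr=0.5 every second frame starts one sample later than it otherwise would, so on average half a sample per frame is dropped.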

resync
For better sync, the analyzer re-syncs to the 500 Hz signal at the start of each test sequence.

So resync is the list of frames on which this syncing occurs.

shift

This is the number of samples in the first, incomplete frame, so that the FFT blocks are aligned to the start of a decoder frame.

The initial value, which aligns the 500 Hz signal to -90 phase, is set in tune().

Further syncing is done in the last portion of main, by aligning the 562 Hz signal to -90 phase in the second test sequence.

sevik

Heh, there was an error in the phase calculation (atan(a/b) vs atan(b/a)).

What I called -90 phase is really 0.
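
To illustrate this kind of mix-up, here is a small sketch using math.atan2, which takes the imaginary part first and the real part second. The helper names are made up for this example, and the sign convention of the real analyzer may differ.

```python
import math

def phase_deg(coeff):
    """Phase of one complex FFT coefficient in degrees, range (-180, 180]."""
    return math.degrees(math.atan2(coeff.imag, coeff.real))

def phase_deg_swapped(coeff):
    """Same call with the arguments swapped, shown only to illustrate the bug."""
    return math.degrees(math.atan2(coeff.real, coeff.imag))

# A purely real, positive coefficient has phase 0 ...
correct = phase_deg(complex(1.0, 0.0))
# ... but with the arguments swapped it is reported a quarter turn away.
wrong = phase_deg_swapped(complex(1.0, 0.0))
```

Swapping the arguments shifts every reported phase by a quarter turn (and mirrors it), which is the same kind of 90 degree confusion.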

sevik

Meaning of the bits for band 2:

nibble 0 - main band offset and phase

0:   500.000    0
1:   562.500    0
2:   562.500  -90
3:   625.000  -90
4:   625.000  180
5:   687.500  180
6:   687.500  +90
7:   750.000  +90
8:   500.000  180
9:   562.500    0
10:  562.500  -90
11:  625.000  -90
12:  625.000  180
13:  687.500  180
14:  687.500  +90
15:  750.000  +90

nibble 1 - subband 562.5sin volume and phase
  bits 3:1 - volume
    0 - mute
    1 - 230
    2 - 400
    3 - 670
    4 - 930
    5 - 1200
    6 - 1550
    7 - 2080
  bit 0 - phase
    0 - 0
    1 - 180

nibble 2 - subband 562.5cos volume and phase
  bits 3:1 - volume
  bit 0 - phase
    0 - -90
    1 - +90

nibble 3 - subband 625cos volume and phase
  bits 3:1 - volume
  bit 0 - phase
    0 - -90
    1 - +90

nibble 4 - subband 625sin- volume and phase
  bits 3:1 - volume
  bit 0 - phase
    0 - 180
    1 - 0


nibble 5 - subband 687.5sin- volume and phase
  bits 3:1 - volume
  bit 0 - phase
    0 - 180
    1 - 0

nibble 6 - subband 687.5cos- volume and phase
  bits 3:1 - volume
  bit 0 - phase
    0 - +90
    1 - -90

nibble 7 - subband 750cos- volume and phase
  bits 3:1 - volume
  bit 0 - phase
    0 - +90
    1 - -90
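
The subband nibbles above can be decoded with a small lookup, sketched below. The table and function names are made up for this example, and it assumes that the volume scale listed under nibble 1 also applies to nibbles 2 to 7, which the list above does not state explicitly.

```python
# Measured amplitude for each 3-bit volume level, copied from the nibble 1 list.
# Assumption: the other subband nibbles use the same scale.
VOLUME = [0, 230, 400, 670, 930, 1200, 1550, 2080]

# nibble index -> (subband frequency in Hz, phase if bit 0 is 0, phase if bit 0 is 1)
BAND2_SUBBANDS = {
    1: (562.5, 0, 180),
    2: (562.5, -90, 90),
    3: (625.0, -90, 90),
    4: (625.0, 180, 0),
    5: (687.5, 180, 0),
    6: (687.5, 90, -90),
    7: (750.0, 90, -90),
}

def decode_subband_nibble(index, nibble):
    """Return (frequency, amplitude, phase in degrees) for nibbles 1..7 of band 2."""
    freq, phase0, phase1 = BAND2_SUBBANDS[index]
    level = (nibble >> 1) & 0x7          # bits 3:1 - volume
    phase = phase1 if nibble & 1 else phase0   # bit 0 - phase
    return freq, VOLUME[level], phase
```

For example, decode_subband_nibble(1, 0b1111) gives the 562.5 Hz subband at the loudest level with 180 phase.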
sevik

Testing of band3 bits (3 bytes format): test_bits_for_3band

test_bits_for_3band.wav
test_bits_for_3band.aud
test_bits_for_3band.fft

Generator code:

#test_bits_for_band3
if 1:
    carrier_frame = silence[:]
    carrier_frame[1] |= 14 # main tone volume 14
    carrier_frame[16] |= 14 
sevik

Meaning of the band 3 bits

There is no main tone correction data, only subband data.

nibble 0 - subband 812.5cos volume and phase
  bits 3:1 - volume
  bit 0 - phase
    0 - -90
    1 - +90

nibble 1 - subband 875cos volume and phase
  bits 3:1 - volume
  bit 0 - phase
    0 - -90
    1 - +90

nibble 2 - subband 875sin- volume and phase
  bits 3:1 - volume
  bit 0 - phase
    0 - 180
    1 -   0

nibble 3 - subband 937.5sin- volume and phase
  bits 3:1 - volume
  bit 0 - phase
    0 - 180
    1 -   0

nibble 4 - subband 937.5cos- volume and phase
  bits 3:1 - volume
  bit 0 - phase
    0 - +90
    1 - -90

nibble 5 - subband 1000cos- volume and phase
  bits 3:1 - volume
  bit 0 - phase
    0 - +90
    1 - -90
sevik

Tools tarball updated with the new FFT analyzer:

phase correction added to compensate for the low-band filter in the Robopanda

code sections of the main function reordered to follow usage order

sevik

Testing of band5 bits (1 byte format): test_bits_for_5band

test_bits_for_5band.wav
test_bits_for_5band.aud
test_bits_for_5band.fft

Generator code:


#test_bits_for_band5
if 1:
    carrier_frame = silence[:]
    carrier_frame[1] |= 14 # main tone volume 14
    carrier_frame[16] |= 14 
sevik

band5 bit values:

nibble 0 - subband 1437.5cos- volume and phase
  bits 3:1 - volume
  bit 0 - phase
    0 - +90
    1 - -90

nibble 1 - subband 1500cos- volume and phase
  bits 3:1 - volume
  bit 0 - phase
    0 - +90
    1 - -90

There is a high probability of error in the phase values, due to the large phase shift in the output filter.

sevik

Testing of band6 bits (1 nibble format): test_bits_for_6band

test_bits_for_6band.wav
test_bits_for_6band.aud
test_bits_for_6band.fft

bits meanings:


nibble 0 - subband 1750cos- volume and phase
  bits 3:1 - volume
  bit 0 - phase
    0 - +90
    1 - -90

sevik

Heh, we are really close :))

I need to recheck the subband frequencies for the other bands, and then I will finally try to encode something :))

3:00am local time, will continue tomorrow :))

If somebody wants to help, check the subband frequencies for the other bands using test_bits or test_bits_for_all_bands.

sevik

Heh, some combinations of parameters lead to anomalies in the output, like long or short frames, etc.

Trying to go through test_all_subbands_freqs:

test_all_subbands_freqs.wav
test_all_subbands_freqs.aud
test_all_subbands_freqs.frame - sequence of frames

Generator code:

#test_all_subbands_freqs
if 1:
    silence = [0] * 64 # nibbles
    carrier_frame = silence[:]
    carrier_frame[2] = 14 # main tone volume 14
    carrier_frame[33] = 14 # first subband volume 7, phase 0

    test_frames = []

    for band in range(16):
        sys.stderr.write("band: %d\n" % band)
        if band
sevik

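Main volumes for the different bands (really the frequency response of the output filter combined with the volume coefficients). Columns are bands (first header row: band number, second header row: frequency in Hz); rows are main volume levels 0-15.

Values below 10 are omitted because of the SNR and the high probability of error.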

      0      1      2      3        4         5      6      7      8      9     10     11     12     13     14     15
     125    250    500    750     1000      1250   1500   1750   2000   2250   2500   2750   3000   3250   3500   3750
0           
1
2
3
4     16     19     18     18       14       14      12     11   
5     28     28     26     26       23       22      20     18     17    14     13     11
6     46     49     48     46       41       40      36     31     24    24     20     16     15     12     11
7     79     81     77     72       65       63      58     51     42    39     34     28     24     20     18     14
8    139    140    132    125      118      107      95     87     75    67     57     50     43     35     28     24
9    233    232    224    214      197      182     163    146    130   114     98     84     71     61     50     41
10   390    385    373    352      331      300     270    242    213   188    163    140    120    100     83     68
11   650    641    620    587      546      498     451    404    357   313    271    233    198    168    139    115
12  1080   1070   1026    973      905      829     748    669    592   518    450    387    330    278    231    191
13  1800   1780   1709   1620     1506     1378    1244   1112    982   861    745    643    548    462    384    317
14  3000   2960   2839   2689     2496     2284    2061   1841   1626  1426   1239   1067    910    765    638    526
15  5000   4920   4715   4469     4141     3785    3416   3047   2590  2360   2049   1765   1503   1267   1057    871

sevik

Subband volumes for combinations of main volume (columns; the second header row is the measured main amplitude) and local volume value (rows):


       main   0 1 2 3   4    5    6    7    8     9    10    11    12    13    14    15
local                  18   28   49   77  132   224   373   620  1026  1709  2839  4715
0                      
1                     
2                                                18    27    47    78   132   210   343
3                                                18    27    47    78   132   227   380
4                                     10   18    28    48    84   140   233   380   619
5                                     10   21    29    50    82   138   233   397   655
6                                10   18   27    50    84   138   231   386   635  1043
7                                12   20   29    50    84   138   231   386   653  1081
8                                15   24   43    70   118   195   325   540   880  1467
9                                15   25   45    70   120   195   325   540   910  1505
10                     10   10   17   32   53    89   150   251   414   693  1140  1889
11                     10   10   17   32   53    89   152   251   414   693  1165  1926
12                     11   13   23   42   71   118   194   325   542   902  1483  2455
13                     11   13   24   42   72   118   197   324   542   902  1508  2490
14                     12   20   31   55   95   160   262   435   726  1202  1998  3293
15                     12   21   32   55   95   160   265   435   726  1213  2018  3318

sevik

Subband frequencies (Hz) for the different bands. Duplicate values have different phases (sin/cos/-sin/-cos)

band  main    sb0     sb1      sb2    sb3    sb4     sb5      sb6    sb7
0        0   offset  62.5     62.5   125.0   125.0   187.5   187.5  250.0
1      250   offset  312.5   312.5   375.0   375.0   437.5   437.5  500.0
2      500   offset  562.5   562.5   625.0   625.0   687.5   687.5  750.0
3      750    812.5  875.0   875.0   937.5   937.5  1000.0     -      -
4     1000   1062.5 1125.0  1125.0  1187.5  1187.5  1250.0     -      -
5     1250   1437.5 1500.0     -       -       -       -       -      -
6     1500   1750.0    -       -       -       -       -       -      -
7     1750   2000.0    -       -       -       -       -       -      -
8     2000   2250.0    -       -       -       -       -       -      -
9     2250   2500.0    -       -       -       -       -       -      -
10    2500   2750.0    -       -       -       -       -       -      -
11    2750   3000.0    -       -       -       -       -       -      -
12    3000   3250.0    -       -       -       -       -       -      -
13    3250   3500.0    -       -       -       -       -       -      -
14    3500   3750.0    -       -       -       -       -       -      -
15    3750   4000.0    -       -       -       -       -       -      -
sevik

Almost done; only the last step, the actual encoding, remains :))

I hope this will be done tomorrow :))

milw

Nice work Sevik. I've ordered some SPI flash like nocturnal, but I wondered if either of you has a diagram of (or link to) the cartridge pinout?

Nocturnal

Hmmm... from left to right, viewing the epoxy side of the board I think it is...

N/A	N/A
Brown	SO
Red	VDD
Orange	CS
Yellow	SI
Green	GND
Blue	CLK
N/A	N/A

I could be wrong though. I connected straight to the socket rather than the cartridge.

sevik

Main volume coefficients are powers of 0.6: coefficient = 0.6^(15-N), amplitude = 5000 * coefficient:

level	coefficient	amplitude
15	1	5000
14	0.6	3000
13	0.36	1800
12	0.216	1080
11	0.1296	648
10	0.07776	388.8
9	0.046656	233.28
8	0.0279936	139.968
7	0.01679616	83.9808
6	0.010077696	50.38848
5	0.0060466176	30.233088
4	0.00362797056	18.1398528
3	0.002176782336	10.88391168
2	0.0013060694016	6.530347008
1	0.00078364164096	3.9182082048
0	0.00047018498458	2.35092492288
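
The table can be reproduced with a one-line formula. This is only a sketch: the function name is made up, and 5000 is simply the amplitude measured at level 15 in these tests, not a constant known from the hardware.

```python
def main_volume(level, full_scale=5000.0):
    """Amplitude for main volume level 0..15, each step down multiplying by 0.6."""
    if not 0 <= level <= 15:
        raise ValueError("main volume level must be in 0..15")
    return full_scale * 0.6 ** (15 - level)

# Print the whole table, loudest level first.
for n in range(15, -1, -1):
    print(n, round(main_volume(n), 4))
```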
sevik

Local volume coefficients use a 0.75 multiplier per step for levels 3-7 and 0.6 for levels 0, 1 and 2.

level 7 = main volume * 0.75

local\main	4	5	6	7	8	9	10	11	12	13	14	15	multiplier
1	1.5	2.4	4.1	6.8	11.3	18.8	31.3	52.2	87.0	145.0	241.7	402.8	0.6
2	2.4	4.1	6.8	11.3	18.8	31.3	52.2	87.0	145.0	241.7	402.8	671.3	0.6
3	4.1	6.8	11.3	18.8	31.3	52.2	87.0	145.0	241.7	402.8	671.3	1118.9	0.75
4	5.4	9.0	15.0	25.1	41.8	69.6	116.0	193.3	322.2	537.1	895.1	1491.9	0.75
5	7.2	12.0	20.0	33.4	55.7	92.8	154.7	257.8	429.7	716.1	1193.5	1989.1	0.75
6	9.6	16.0	26.7	44.5	74.2	123.7	206.2	343.7	572.9	954.8	1591.3	2652.2	0.75
7	12.8	21.4	35.6	59.4	99.0	165.0	275.0	458.3	763.8	1273.0	2121.8	3536.2	0.75
main	17.1	28.5	47.5	79.2	132.0	220.0	366.6	611.1	1018.4	1697.4	2829.0	4715.0	
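
A sketch of how the table follows from those multipliers: each row is the row below it times the multiplier in its last column, starting from level 7 = main * 0.75. The function name is made up, and level 0 is left out because it is listed as mute in the nibble descriptions.

```python
def local_volume(main_amplitude, level):
    """Subband amplitude for local volume level 1..7, given the main amplitude."""
    if not 1 <= level <= 7:
        raise ValueError("local volume level must be in 1..7")
    steps_075 = min(8 - level, 5)   # 0.75 steps: main -> 7 -> 6 -> 5 -> 4 -> 3
    steps_06 = max(3 - level, 0)    # 0.6 steps: 3 -> 2 -> 1
    return main_amplitude * 0.75 ** steps_075 * 0.6 ** steps_06
```

For main volume 15 (amplitude 4715) this gives about 3536 for level 7 and about 403 for level 1, matching the last data column.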
sevik

Heh :))

Can't say that it works well :)) But it works :))

test_encode.wav
test_encode.aud

Source file:

test_encode_input.wav

Tools tarball updated with current encoder :))

It seems there are bugs in the encoder, because the spectrum of the encoded file does not look the way it should.

sevik

A somewhat better version :))

Fixed one bug in the encoder and added a window before the FFT.
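
For readers unfamiliar with windowing, a minimal sketch follows. The post does not say which window the encoder uses; a symmetric Hann window and a 128-sample frame are assumed here purely as an example, and the function names are made up.

```python
import math

def hann(n):
    """Symmetric Hann window of length n (zero at both ends)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def apply_window(frame, window):
    """Multiply a frame sample by sample with the window before taking the FFT."""
    return [s * w for s, w in zip(frame, window)]

window = hann(128)
windowed = apply_window([1000] * 128, window)
```

Tapering the frame edges reduces the spectral leakage caused by cutting the signal into blocks.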

test_encode2.wav
test_encode2.aud

Tools tarball updated with current encoder :))

sevik

The speech in the input file is too fast :)) Robopanda usually speaks slower :))

Does anybody have a more appropriate source file? :))

sevik

There are clear remains of the 16 ms frame rate in the decoded spectrum. So there is a need to write or adapt a real encoder that uses the phase information and accounts for the overlapping of output frames.

Anybody with a DSP background? :))
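
As a quick sanity check of how the 16 ms frame relates to the other numbers in this thread, here is the arithmetic as a sketch. The 8 kHz sample rate is an assumption taken from the 8 kHz resampling mentioned in the thread.

```python
SAMPLE_RATE = 8000       # Hz, assumed
FRAME_SECONDS = 0.016    # 16 ms frame

# Samples per frame; matches the 128-sample FFT blocks used by the analyzer.
frame_samples = round(SAMPLE_RATE * FRAME_SECONDS)

# Spacing between FFT bins; matches the 62.5 Hz subband step.
bin_spacing = SAMPLE_RATE / frame_samples

# The 562.5 Hz subband then falls exactly on one bin.
subband_bin = 562.5 / bin_spacing
```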

milw

You can try encoding this, my son's voice is a closer match to Robopanda... Do you suspect the encoder might be optimized to a higher pitched voice?
realvoice.wav

sevik

I don't think it's optimized; it's just that the wav used in my encodings, resampled to 8 kHz, is not very good by itself :))

I'm writing an emulator now, so that the current understanding can be tested on real data and the encoder can be worked on without real hardware.

sevik

Heh, it seems some interactions between parameters were guessed wrong :))

00837_dec.wav decoded version of "Welcome to training mode"

current versions of encoder and decoder in tools tarball :))
