Pre-trained models do not reproduce paper results · Issue #6 · asheshjain399/RNNexp · GitHub

Pre-trained models do not reproduce paper results #6


Open
una-dinosauria opened this issue Sep 7, 2016 · 16 comments

Comments

@una-dinosauria
una-dinosauria commented Sep 7, 2016

Hi!

I'm using the pre-trained models available at https://drive.google.com/open?id=0B7lfjqylzqmMZlI3TUNUUEFQMXc and running generateMotionForecast.py. This produces motion predictions for different activities and models, but I've found that these do not correspond to what is reported in the paper. For reference, here's the figure from the paper that I'm talking about:

[Image: selection_007]

But, for example, using the lstm3lr_walking model, the checkpoint.pikforecast_error file contains the following values:

T=0 2.87922000885, 0.053318887949
T=1 3.31045722961, 0.0613047629595
T=2 3.72076749802, 0.0689031034708
T=3 4.20972061157, 0.0779577866197
T=4 4.62205123901, 0.0855935439467
T=5 4.9056763649, 0.0908458605409
T=6 5.15456962585, 0.0954549908638
T=7 5.68943977356, 0.105359993875
T=8 6.14819526672, 0.113855466247
T=9 6.47697734833, 0.119944028556
T=10 6.86927509308, 0.127208799124
T=11 7.25948381424, 0.134434878826
T=12 7.56049823761, 0.140009224415
T=13 7.60584354401, 0.140848949552
T=14 7.81918954849, 0.144799813628
T=15 7.99432945251, 0.148043140769
T=16 8.21197509766, 0.152073606849
T=17 8.22490978241, 0.152313143015
T=18 8.21773910522, 0.152180358768
T=19 8.20940303802, 0.152025982738
T=20 8.21308326721, 0.152094140649
T=21 8.08870410919, 0.14979082346
T=22 7.9909658432, 0.147980853915
T=23 7.93785572052, 0.146997332573
T=24 8.08372688293, 0.149698644876
T=25 8.17058372498, 0.151307106018
T=26 8.29908180237, 0.153686702251
T=27 8.29321861267, 0.15357811749
T=28 8.33865356445, 0.154419511557
T=29 8.29992961884, 0.153702393174
T=30 8.31999206543, 0.154073923826
T=31 8.37398910522, 0.155073866248
T=32 8.47292232513, 0.156905964017
T=33 8.59246826172, 0.159119784832
T=34 8.65988731384, 0.160368278623
T=35 8.66351318359, 0.160435423255
T=36 8.65542507172, 0.160285651684
T=37 8.70272254944, 0.161161527038
T=38 8.90265083313, 0.16486389935
T=39 9.08981990814, 0.168329998851
T=40 9.22410964966, 0.170816838741
T=41 9.25332164764, 0.171357810497
T=42 9.3009595871, 0.172239989042
T=43 9.29813861847, 0.172187745571
T=44 9.26357460022, 0.171547681093
T=45 9.19590568542, 0.170294553041
T=46 9.15723419189, 0.169578418136
T=47 9.24366569519, 0.171178996563
T=48 9.30495262146, 0.172313943505
T=49 9.25953674316, 0.171472907066
T=50 9.24114990234, 0.171132400632
T=51 9.26937294006, 0.171655058861
T=52 9.3104429245, 0.172415614128
T=53 9.19757270813, 0.170325413346
T=54 9.04441356659, 0.167489141226
T=55 8.96823406219, 0.166078403592
T=56 9.00592136383, 0.166776314378
T=57 9.09947776794, 0.168508842587
T=58 9.06608009338, 0.167890369892
T=59 9.1175775528, 0.168844029307
T=60 9.23169708252, 0.170957356691
T=61 9.25059127808, 0.171307250857
T=62 9.23868370056, 0.171086728573
T=63 9.21300506592, 0.170611202717
T=64 9.20988750458, 0.170553475618
T=65 9.30304908752, 0.172278687358
T=66 9.30745029449, 0.17236019671
T=67 9.29339599609, 0.172099933028
T=68 9.21964550018, 0.170734182
T=69 9.22905826569, 0.170908480883
T=70 9.11111068726, 0.168724268675
T=71 9.0918712616, 0.168367981911
T=72 8.92658901215, 0.165307208896
T=73 8.91659736633, 0.165122166276
T=74 8.82111263275, 0.163353934884
T=75 8.90966320038, 0.16499376297
T=76 9.02032756805, 0.167043104768
T=77 9.09782981873, 0.168478325009
T=78 9.22392463684, 0.170813426375
T=79 9.33905029297, 0.172945380211
T=80 9.31301212311, 0.172463193536
T=81 9.44260978699, 0.174863144755
T=82 9.45653438568, 0.17512100935
T=83 9.52670955658, 0.176420554519
T=84 9.64883327484, 0.178682103753
T=85 9.83387374878, 0.182108774781
T=86 9.95151329041, 0.184287279844
T=87 9.91870689392, 0.183679759502
T=88 9.91715335846, 0.18365098536
T=89 10.0150337219, 0.18546359241
T=90 9.95522022247, 0.184355929494
T=91 9.70408630371, 0.179705306888
T=92 9.56737327576, 0.1771735847
T=93 9.58298301697, 0.177462652326
T=94 9.52612495422, 0.176409721375
T=95 9.55842971802, 0.177007958293
T=96 9.53139877319, 0.176507383585
T=97 9.50600910187, 0.176037207246
T=98 9.59951972961, 0.177768886089
T=99 9.80951976776, 0.181657776237

where the left and right columns correspond to skel_err and err_per_dof, respectively, as computed in forecastTrajectories.py#L124:

skel_err = np.mean(np.sqrt(np.sum(np.square((forecasted_motion - trY_forecasting)),axis=2)),axis=1)
err_per_dof = skel_err / trY_forecasting.shape[2]

I find one value to be much worse, and the other to be about 1 order of magnitude better. Do you have any pointers as to what I could be doing wrong?
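
For readers following along, here is a minimal, self-contained sketch of the two quantities above. It assumes the arrays are shaped (T, N, D), that is timesteps, sequences, and feature dimensions, and it runs on synthetic data only; the function name `forecast_errors` is hypothetical. In the listing above the right column equals the left column divided by 54 (for example 2.87922 / 54 ≈ 0.05332), which is consistent with D = 54.

```python
import numpy as np

def forecast_errors(forecast, ground_truth):
    """Per-timestep errors for arrays shaped (T, N, D)."""
    diff = forecast - ground_truth
    # L2 norm over the feature axis, then mean over sequences -> shape (T,)
    skel_err = np.mean(np.sqrt(np.sum(np.square(diff), axis=2)), axis=1)
    # Same quantity divided by the number of feature dimensions
    err_per_dof = skel_err / ground_truth.shape[2]
    return skel_err, err_per_dof

# Synthetic data only: 100 timesteps, 8 sequences, 54 features.
rng = np.random.default_rng(0)
gt = rng.normal(size=(100, 8, 54))
pred = gt + 0.1  # constant offset of 0.1 in every dimension
skel_err, err_per_dof = forecast_errors(pred, gt)
```

Note that these values are computed directly on the network's output representation, so they are not comparable to errors measured after converting to another representation.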

@una-dinosauria una-dinosauria changed the title Cannot reproduce paper results Pre-trained models do not reproduce paper results Sep 12, 2016
@una-dinosauria
Author

Hey! Sorry for bothering again. I've also made some movies with these models and they definitely do not correspond to what is shown in the official video of the paper -- maybe you didn't upload the final final models?

@asheshjain399
Owner

The final models are here: https://drive.google.com/open?id=0B7lfjqylzqmMZlI3TUNUUEFQMXc (same link as above). The numbers in Table 1 are Euler angle errors, not exponential map errors. I think you are outputting exponential map errors.

The models are trained on the exponential map representation of the joints; the output is then converted to the Euler angle representation for visualization and quantitative comparison.
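
To illustrate the kind of conversion described here, the following is a rough Python sketch of exponential map -> rotation matrix -> Euler angles for a single joint. It is not the repository's Matlab code: the function names are hypothetical, and the ZYX Euler convention is an assumption chosen for illustration, so the angles may not match what the scripts in Utils produce.

```python
import numpy as np

def expmap_to_rotmat(r):
    """Rodrigues' formula: axis-angle vector (3,) -> rotation matrix (3, 3)."""
    r = np.asarray(r, dtype=float)
    theta = np.linalg.norm(r)
    if theta < 1e-12:
        return np.eye(3)  # zero rotation
    k = r / theta
    # Skew-symmetric cross-product matrix of the unit axis
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def rotmat_to_euler_zyx(R):
    """Rotation matrix -> (roll, pitch, yaw) with R = Rz(yaw) @ Ry(pitch) @ Rx(roll).

    Assumed convention for illustration only; no gimbal-lock handling.
    """
    pitch = np.arcsin(-R[2, 0])
    roll = np.arctan2(R[2, 1], R[2, 2])
    yaw = np.arctan2(R[1, 0], R[0, 0])
    return np.array([roll, pitch, yaw])

# A rotation of 0.3 rad about the z axis comes back as yaw = 0.3.
angles = rotmat_to_euler_zyx(expmap_to_rotmat([0.0, 0.0, 0.3]))
```

Because the mapping is nonlinear, an L2 error in exponential map space and an L2 error in Euler angle space generally differ for the same prediction.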

@asheshjain399
Owner

You should look into the Utils directory. It has some Matlab scripts that do the conversion for you (sorry, Utils is not documented yet).

@una-dinosauria
Author
una-dinosauria commented Sep 17, 2016

Thanks a lot. I looked into the utils directory and found this motionGenerationError.m file that computes error with the conversion expmap->rotmat->euler. When I run this on the generated motion of pre-trained models, I get the following errors:

erd walking        [0.93 1.18 1.59 1.97 2.24 ]
lstm3lr walking    [0.77 1.00 1.29 1.74 1.84 ]
srnn walking       [0.81 0.94 1.16 1.48 1.78 ]
erd eating         [1.27 1.45 1.66 1.95 2.02 ]
lstm3lr eating     [0.89 1.09 1.35 1.66 1.97 ]
srnn eating        [0.97 1.14 1.35 1.62 2.09 ]
erd smoking        [1.66 1.95 2.35 2.63 3.61 ]
lstm3lr smoking    [1.34 1.65 2.04 2.30 2.59 ]
srnn smoking       [1.45 1.68 1.94 2.24 2.64 ]
erd discussion     [2.27 2.47 2.68 2.92 3.16 ]
lstm3lr discussion [1.88 2.12 2.25 2.33 2.45 ]
srnn discussion    [1.22 1.49 1.83 2.07 2.24 ]

These results are a bit better than those reported in Table 1 :) -- Do you have an idea of what could be causing the discrepancy? I've noticed that the code ignores the global rotation and translation (e.g. motionGenerationError.m#L35 sets them to zero); I experimented with setting only the global rotation to zero and I get slightly worse results, but still better than those in Table 1. However, if I completely comment out that line (i.e., I include both global rotation and global translation), I get the following results:

erd walking        [4.69 11.71 33.38 62.81 106.05 ]
lstm3lr walking    [4.30 10.24 29.06 51.01 83.57 ]
srnn walking       [6.94 15.12 32.33 57.26 89.95 ]
erd eating         [4.87 10.07 17.09 27.64 38.08 ]
lstm3lr eating     [6.24 12.47 22.04 40.35 82.21 ]
srnn eating        [5.05 9.32 14.75 23.29 35.91 ]
erd smoking        [4.20 7.77 15.42 31.38 51.39 ]
lstm3lr smoking    [3.75 7.26 14.21 21.88 31.83 ]
srnn smoking       [4.44 8.15 14.34 22.08 31.76 ]
erd discussion     [5.95 13.99 33.42 59.98 111.36 ]
lstm3lr discussion [9.55 20.88 42.32 62.46 82.69 ]
srnn discussion    [9.40 19.81 39.03 59.83 99.46 ]

Which are definitely much worse.

Thanks again for getting back to me; we seem to be getting closer to reproducing the results in the paper.
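
As a rough illustration of why the global channels matter so much, the sketch below zeroes them before computing a per-frame error. The channel layout (translation in channels 0-2, rotation in channels 3-5 of a 99-dimensional frame) is an assumption that should be checked against motionGenerationError.m, the helper names are hypothetical, and the plain L2 error stands in for the script's Euler angle error.

```python
import numpy as np

# Assumption: channels 0-2 hold global translation and 3-5 global rotation.
GLOBAL_TRANSLATION = slice(0, 3)
GLOBAL_ROTATION = slice(3, 6)

def mask_global(frames, drop_translation=True, drop_rotation=True):
    """Return a copy of frames (T, D) with the selected global channels zeroed."""
    out = np.array(frames, dtype=float, copy=True)
    if drop_translation:
        out[:, GLOBAL_TRANSLATION] = 0.0
    if drop_rotation:
        out[:, GLOBAL_ROTATION] = 0.0
    return out

def per_frame_error(pred, truth):
    """Plain L2 distance per frame, used here only for illustration."""
    return np.sqrt(np.sum((pred - truth) ** 2, axis=1))

# Synthetic example: the prediction drifts only in the six global channels.
truth = np.zeros((5, 99))
pred = truth.copy()
pred[:, :6] += np.arange(1, 6)[:, None]

err_with_global = per_frame_error(pred, truth)
err_without_global = per_frame_error(mask_global(pred), mask_global(truth))
```

In this toy case the error without the global channels is zero while the error with them grows at every frame, which mirrors the pattern in the two tables above.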

@una-dinosauria
Author

As a side note, I'm assuming that everything is at 25fps, right? Since you have 8x less data than what can be downloaded from human3.6m, and that is sampled at 200fps. Hence, in the error vector I'm using the indices [2,4,8,14,25] which correspond to [80, 160, 320, 560 and 1000] milliseconds.

@asheshjain399
Owner

The errors reported in Table 1 only include the Euler angles, and do not include the global translation and rotation errors. This is similar to Fragkiadaki et al. ICCV'15.

We used mocap data at 100 Hz (downsampled by 2) and not at 25 Hz. The reason you see less data is that we don't use all the data from human3.6. The details of the sequences we used can be found in the experiment section of the paper. Just to reiterate, our experiment settings (to the best of our effort) are very similar to Fragkiadaki et al.

@Seleucia
Seleucia commented Dec 4, 2016

Hello @asheshjain399, thank you very much for releasing the code and pre-trained models. I'm trying to reproduce your results but I haven't managed to.

I see that you normalize the data, so the prediction is also normalized. Are you computing the error over normalized data, or are you unnormalizing your prediction with the data statistics?

I used the motionGenerationError.m file to compute the error, but it seems to expect a 99-dimensional prediction vector, whereas the code produces a 54-dimensional vector. I modified motionGenerationError.m to handle this, but I'm not sure whether that is the correct way.

Another thing: I see that you compute a direct 2D L2 loss between the angles. Shouldn't it be a 3D loss between joints? With that, your error would be considerably lower.

@pvmilk
pvmilk commented Jul 24, 2017

@una-dinosauria I am trying to reproduce your result by modifying motionGenerationError.m to calculate error from forecast_N_n and ground_truth_forecast_N_n, n in [0, 23].

lstm3lr walking [0.77 1.00 1.29 1.74 1.84 ]

Note : I believe that this is the same value that appears in your paper (cvpr2017; on human motion prediction using RNN).

However, I got numbers that are very different from your result and from the result in the SRNN paper.

lstm3lr walking [7.7294, 8.7923, 8.7971, 9.2380, 9.1237]

The only modification I made to motionGenerationError.m is

And other differences could be

  • The code is running on Octave rather than Matlab
    • (only warning) warning: RNNexp/structural_rnn/CRFProblems/H3.6m/mhmublv/Motion/RotMat2Euler.m: possible Matlab-style short-circuit operator at line 34, column 16
  • The prediction was run using Theano 0.9.0

Am I missing something here? E.g., unnormalizing the data, or considering only n in [0,7].

@Seleucia So did you manage to solve your issue of reproducing the results from the SRNN paper using a pre-trained model?

Thank you.

@una-dinosauria
Author

I did manage to get the numbers that I reported, and I remember them being reasonably close to what the SRNN paper reports. Have you made movies for your predictions? If I remember correctly, the movie for discussion was exactly what is shown in the official SRNN movie, but I never managed to get the other ones.

I'm currently away at a conference but if you make your branch public I can look at the code once I get back to the lab (and make a diff with my code to see if there's something noticeably different).

@Seleucia

Yes, I managed to get exactly the same numbers given in the SRNN paper with the pre-trained models. I think the confusion here is related to the subsampling. @una-dinosauria is right that the data given here is at 25 fps, not 100 fps.

@pvmilk
pvmilk commented Jul 25, 2017

@una-dinosauria No, I haven't made movies for the predictions yet. Let me try a couple of things on my own. If it is still not working, I will ask you for a diff as a favor.

@Seleucia Did you make any changes to the source code other than the one to motionGenerationError.m mentioned earlier (99->54)? Can you also elaborate on the subsampling issue?

Thank you.

@Seleucia
Seleucia commented Jul 25, 2017

I did not make any changes except the ones I mentioned here. The subsampling issue was related to the selected frames: the SRNN paper reports frames [8, 16, 32, 56, 100], not the ones @una-dinosauria used, [2,4,8,14,25]. The SRNN paper assumes the data was subsampled by 2, whereas @una-dinosauria's paper assumes it was subsampled by 8. I think @una-dinosauria is right and the times given in the SRNN paper are wrong.
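
The arithmetic behind the two index sets can be checked with a few lines of Python. This is only a sketch of the frame-to-time conversion discussed in this thread; the helper name `frames_to_ms` is hypothetical, and indices are treated as 1-based frame counts.

```python
def frames_to_ms(frame_indices, fps):
    """Elapsed milliseconds after each 1-based frame index at a given frame rate."""
    return [round(i * 1000.0 / fps) for i in frame_indices]

# Both index sets target the same horizons under their own assumed frame rate.
at_25 = frames_to_ms([2, 4, 8, 14, 25], fps=25)
at_100 = frames_to_ms([8, 16, 32, 56, 100], fps=100)

# If the data is really at 25 fps, the larger indices reach much further ahead.
large_indices_at_25 = frames_to_ms([8, 16, 32, 56, 100], fps=25)
```

Both `at_25` and `at_100` come out as [80, 160, 320, 560, 1000] ms, while `large_indices_at_25` is [320, 640, 1280, 2240, 4000] ms, so the choice of index set changes the reported horizon by a factor of four if the frame rate is 25 fps.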

@pvmilk
pvmilk commented Jul 25, 2017

@una-dinosauria I tried to duplicate your result as mentioned, but without success.
Could you have a look into it when you have time?

Thank you.

Here is what I did and my result:

1.) I use the srnn branch of both RNNexp (@3ba986b) and NeuralModels (@fb02335).

2.) The changes I made are only to make the program run, and they can be found in patch_srnn.txt

3.) I download the data and pre-trained model, then forecast the motion with

$ python generateMotionForecast.py lstm3lr `datapath`/pre-trained/lstm3lr_walking/ 

4.) I calculate the error using matlab script

$ octave

octave:1> merr = motionGenerationError('`datapath`/pre-trained/lstm3lr_walking/');

(I actually use octave here, also I need to download H3.6m visualize code version 1.1 and extract it under RNNexp/structural_rnn/CRFProblems/H3.6m/h36devkit folder).

lstm3lr walking

merr([2,4,8,14,25]) = 3.0110 3.9911 4.8584 6.8636 7.1127

Below are the error values for all 100 predicted frames.

merr =

   2.5716
   3.0110
   3.5399
   3.9911
   4.3237
   4.6516
   4.5892
   4.8584
   5.4216
   5.6991
   6.5395
   7.0668
   7.1334
   6.8636
   7.0044
   7.0685
   7.9258
   7.9086
   7.5250
   7.8908
   7.2020
   7.1370
   7.3037
   7.1200
   7.1127
   7.0545
   7.2075
   7.1480
   7.2220
   7.1470
   6.9809
   7.0547
   7.2562
   7.2793
   7.2724
   7.2901
   7.2155
   6.9784
   7.2433
   7.0532
   7.4498
   7.2638
   7.2666
   7.6310
   7.5034
   7.2594
   7.4710
   7.3735
   7.4623
   7.0478
   6.8761
   6.9157
   6.8739
   6.9698
   6.6872
   6.9685
   7.0161
   6.8627
   6.8614
   6.8071
   6.7301
   6.9461
   6.6581
   6.6281
   6.8499
   7.2705
   7.5901
   7.7002
   7.4472
   7.4562
   7.5396
   7.4184
   7.1077
   6.9915
   6.7552
   6.6909
   6.5945
   6.6490
   7.0078
   7.3325
   7.2949
   7.3203
   7.5912
   7.4449
   7.7315
   7.7443
   7.5951
   7.7246
   7.5485
   7.5036
   7.3329
   7.3004
   7.2497
   7.2188
   7.4018
   7.7028
   7.6853
   8.0556
   8.3116
   8.2240
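
One detail worth keeping in mind when porting this to Python: Octave and Matlab index from 1, so merr([2,4,8,14,25]) picks the 2nd, 4th, 8th, 14th and 25th entries. A small sketch using the first 25 values listed above (the helper name `pick_one_based` is hypothetical):

```python
# First 25 values of merr as listed above; Octave index 1 is position 0 here.
merr = [2.5716, 3.0110, 3.5399, 3.9911, 4.3237, 4.6516, 4.5892, 4.8584,
        5.4216, 5.6991, 6.5395, 7.0668, 7.1334, 6.8636, 7.0044, 7.0685,
        7.9258, 7.9086, 7.5250, 7.8908, 7.2020, 7.1370, 7.3037, 7.1200,
        7.1127]

def pick_one_based(values, indices):
    """Select entries using 1-based indices, as Octave/Matlab would."""
    return [values[i - 1] for i in indices]

selected = pick_one_based(merr, [2, 4, 8, 14, 25])
```

This reproduces the five values quoted at the top of the listing. Using the same indices directly as 0-based positions would shift every horizon by one frame.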

@pvmilk
pvmilk commented Jul 26, 2017

@una-dinosauria I think I got it already. The output of the prediction from generateMotionForecast.py needs to be unnormalised before calculating the error with motionGenerationError.m.

There is an unnormalisation method provided in unNormalizeData.py, but you would need to modify the source code to apply it yourself.

For those who are following the thread, I will provide the patch once I clean up my code.

Thank you.
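
Until the patch is available, here is a rough sketch of what unnormalising could look like, assuming the data was z-scored per dimension and that some dimensions were dropped before training (which would explain the 54 versus 99 dimensions mentioned earlier in the thread). The function `unnormalize`, its arguments, and the choice of ignored dimensions are all hypothetical and are not taken from unNormalizeData.py.

```python
import numpy as np

def unnormalize(normalized, mean, std, ignored_dims):
    """Map (T, D_used) z-scored frames back to (T, D_full) raw units.

    Hypothetical helper: ignored_dims are the columns dropped before training.
    """
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    normalized = np.asarray(normalized, dtype=float)
    ignored = set(ignored_dims)
    used = [d for d in range(mean.shape[0]) if d not in ignored]
    if normalized.shape[1] != len(used):
        raise ValueError("feature count does not match the used dimensions")
    full = np.tile(mean, (normalized.shape[0], 1))  # ignored columns keep their mean
    full[:, used] = normalized * std[used] + mean[used]  # undo the z-scoring
    return full

# Synthetic round trip: 99 raw dimensions, 45 arbitrarily chosen as ignored.
rng = np.random.default_rng(0)
raw = rng.normal(loc=2.0, scale=3.0, size=(10, 99))
mean, std = raw.mean(axis=0), raw.std(axis=0)
ignored_dims = list(range(54, 99))  # placeholder choice, not the real list
normalized = (raw[:, :54] - mean[:54]) / std[:54]
restored = unnormalize(normalized, mean, std, ignored_dims)
```

Filling the ignored columns with their mean is a design choice of this sketch; the real code may handle them differently.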

@pvmilk
pvmilk commented Jul 27, 2017

As promised, please use the following patch, srnn_patch.txt, in place of the one in step 2.) that I provided above.

With this, you should be able to reproduce the same or similar results as Structural-RNN for the lstm3lr and erd cases.

lstm3lr walking

merr([8,16,32,56,100]) =   1.1697 1.4747 1.6444 1.7967 2.1886

erd walking

merr([8,16,32,56,100]) =  1.3010 1.5636 1.8428 2.005 2.3858

Please note that if I use merr([2,4,8,14,25]), the result is slightly better than the one reported in the baseline paper (cvpr2017; on human motion prediction using RNN).

lstm3lr walking

merr([2,4,8,14,25]) =   0.67755 0.88913 1.16974 1.41097 1.59932

erd walking

merr([2,4,8,14,25]) =   0.85603 1.04604 1.30096 1.52555 1.71511

UPDATE (9 August 2017):
For those also trying to reproduce the results for the other actions (eating, smoking, discussion), you may need to look at the 'actions' parameter in 'RNNexp/structural_rnn/CRFProblems/H3.6m/processdata.py'.

@MAtthewGHuser

Thanks a lot. I looked into the utils directory and found this motionGenerationError.m file that computes error with the conversion expmap->rotmat->euler. When I run this on the generated motion of pre-trained models, I get the following errors:

erd walking        [0.93 1.18 1.59 1.97 2.24 ]
lstm3lr walking    [0.77 1.00 1.29 1.74 1.84 ]
srnn walking       [0.81 0.94 1.16 1.48 1.78 ]
erd eating         [1.27 1.45 1.66 1.95 2.02 ]
lstm3lr eating     [0.89 1.09 1.35 1.66 1.97 ]
srnn eating        [0.97 1.14 1.35 1.62 2.09 ]
erd smoking        [1.66 1.95 2.35 2.63 3.61 ]
lstm3lr smoking    [1.34 1.65 2.04 2.30 2.59 ]
srnn smoking       [1.45 1.68 1.94 2.24 2.64 ]
erd discussion     [2.27 2.47 2.68 2.92 3.16 ]
lstm3lr discussion [1.88 2.12 2.25 2.33 2.45 ]
srnn discussion    [1.22 1.49 1.83 2.07 2.24 ]

These results are a bit better than those reported in Table 1 :) -- Do you have an idea of what could be causing the discrepancy? I've noticed that the code ignores the global rotation and translation (e.g. motionGenerationError.m#L35 sets them to zero); I experimented with setting only the global rotation to zero and I get slightly worse results, but still better than those in Table 1. However, if I completely comment out that line (i.e., I include both global rotation and global translation), I get the following results:

erd walking        [4.69 11.71 33.38 62.81 106.05 ]
lstm3lr walking    [4.30 10.24 29.06 51.01 83.57 ]
srnn walking       [6.94 15.12 32.33 57.26 89.95 ]
erd eating         [4.87 10.07 17.09 27.64 38.08 ]
lstm3lr eating     [6.24 12.47 22.04 40.35 82.21 ]
srnn eating        [5.05 9.32 14.75 23.29 35.91 ]
erd smoking        [4.20 7.77 15.42 31.38 51.39 ]
lstm3lr smoking    [3.75 7.26 14.21 21.88 31.83 ]
srnn smoking       [4.44 8.15 14.34 22.08 31.76 ]
erd discussion     [5.95 13.99 33.42 59.98 111.36 ]
lstm3lr discussion [9.55 20.88 42.32 62.46 82.69 ]
srnn discussion    [9.40 19.81 39.03 59.83 99.46 ]

Which are definitely much worse.

Thanks again for getting back to me; we seem to be getting closer to reproducing the results in the paper.

Hi, I am also trying to reproduce the results. I use motionGenerationError.m to convert the data expmap->rotmat->euler, but the result I get looks like this:

srnn walking       [4.57 5.12 5.95 6.04 7.43 ]    (skel_err)
                   [0.10 0.11 0.12 0.13 0.15 ]    (err_per_dof)

It's different from your result

srnn walking       [0.81 0.94 1.16 1.48 1.78 ]

And it is also different from results in the paper.

Could I have a look at your reproduction code? I would appreciate it very much.
