[DOC] jupyter notebook - training, rolling window, multivariate #159

jayzer · 2025-05-08T23:56:45Z

Multivariate Kalman Filter with Rolling Window Example

This example demonstrates how to apply a Multivariate Kalman Filter to smooth noisy time-series data using a rolling window approach. It covers the following:

- Simulating data with noise, outliers, and missing values.
- Initializing and training the Kalman Filter.
- Implementing a rolling window for retraining and smoothing.
- Visualizing the results.

This example illustrates how to use the pykalman library in a rolling window use case. Comments explain how to use the package for data imputation, outlier handling, and dealing with missing values.
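
As a rough illustration of the simulation step, the sketch below builds a two-variable series with noise, outliers, and missing values using NumPy only. It is not taken from the notebooks: the function name `simulate_series`, the signal shapes, and the outlier and missing rates are assumptions chosen for illustration. Missing points are stored in a masked array, which is the form pykalman accepts for missing observations.

```python
import numpy as np


def simulate_series(n_steps=200, outlier_rate=0.03, missing_rate=0.05, seed=0):
    """Return (clean, observed): a clean 2-D signal and a corrupted, masked copy."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 4.0 * np.pi, n_steps)

    # Two underlying signals: a sine wave, and a slow trend plus a cosine.
    clean = np.column_stack([np.sin(t), 0.1 * t + 0.5 * np.cos(t)])

    # Measurement noise on every point.
    noisy = clean + rng.normal(scale=0.2, size=clean.shape)

    # Outliers: add large spikes at a random subset of time steps.
    outlier_rows = rng.random(n_steps) < outlier_rate
    noisy[outlier_rows] += rng.normal(scale=3.0, size=(int(outlier_rows.sum()), 2))

    # Missing values: mask whole time steps (both variables at once).
    missing_rows = rng.random(n_steps) < missing_rate
    mask = np.repeat(missing_rows[:, None], 2, axis=1)
    observed = np.ma.masked_array(noisy, mask=mask)
    return clean, observed


clean, observed = simulate_series()
```

Masking whole rows rather than single entries is a deliberate simplification here, so that each time step is either fully observed or fully missing.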

Added an example of a use case, multivariable, with extreme data and missing data.

Example use case of Multivariate Kalman Filter with a rolling window in Jupyter Notebook format.

jayzer · 2025-05-09T00:17:30Z

rewriting second pull request

jayzer · 2025-05-09T00:29:32Z

Multivariate Kalman Filter with Rolling Window Example

This example demonstrates how to apply a Kalman Filter to smooth noisy time-series data using a rolling window approach. It covers the following:

- Simulating data with noise, outliers, and missing values.
- Initializing and training the Kalman Filter on some initial data points.
- Implementing a rolling window for retraining and smoothing on the rest of the data, an approach that illustrates how the filter could be applied incrementally as new data arrives.
- Visualizing the results.

This example illustrates how to use the pykalman library in a rolling window use case. Comments explain how to use the package for data imputation, outlier handling, and dealing with missing values.

fkiraly

Quick question, there are two PRs that add the same files - which one should we be looking at? The other one is this: #158

I assume this one, i.e., #159, because it is from a branch and not your fork main?

jayzer · 2025-05-13T23:31:41Z

Sorry if it was unclear. There are actually two distinct files for the examples section, which I tried to submit in two PRs. The first is a more basic example of the KF: data generated with added extremes and missing points, with the raw data graphed alongside the KF-smoothed data. The second smooths batches in a rolling window, with periodic updates to the KF.

examples/pykalman_multivariate_example.ipynb

examples/pykalman_rolling_window_example.ipynb

thanks

fkiraly · 2025-05-14T06:29:51Z

Thanks!

The question was not about the files, but about the two pull requests, #158 and #159. Both pull requests contain both files, so we can merge at most one of them (without conflicts) - which of the two pull requests is the correct one to look at?

jayzer · 2025-05-14T08:41:48Z

I think let's try 159 then? I did not change either Jupyter notebook between 158 and 159, only the pull comment, so I expect the files to be equivalent in that case, with no expected conflict.

jayzer · 2025-05-19T07:24:24Z

any other thoughts on this pull request?

fkiraly

Overall, very nice pull request! We have been looking for someone to help with jupyter notebooks, and these are nice! Also see issue #129.

A few comments:

- I would use headers more: you can use markdown cells and hashes, such as `### this is a header`, for headers of different levels. For instance, the header comments in code ("step 1" etc.) might look better in markdown, and would also allow integration with search and a table of contents later.
- Remove the pip install commands at the start; expect that users have an environment already set up. Also, running pip install from the notebook without caution may break an otherwise valid environment.
- I would also advise being more concise and to-the-point: avoid long text and use "telegram style" or "bullet point style" more, so users get to the "meat" quicker. For instance, instead of "This demonstration showcases the application of the Kalman Filter for smoothing noisy time-series data using the pykalman library in Python. We'll simulate data with noise, outliers, and missing values.", you could write:
  - this notebook: Kalman Filter for smoothing
  - on data with noise, outliers, missing values
- Avoid overly long printouts. As a rule of thumb, a printout should be 7 rows max.
- Dataframes do not need to be printed: you can simply evaluate them on the last line, and they will "pretty display" in Jupyter. E.g., write my_df on the last line of a code cell, instead of print(my_df).
- The two notebooks could be two chapters in a single notebook? They are quite similar. If you order them, it might be easier for the user to follow.
- Alternatively, you could number the notebooks, so users read them in the intended order.
- Re naming: it is clear that the notebooks are about pykalman and that they contain examples, so you could shorten the names to 01_multivariate and 02_rolling_window. Alternatively, if you merge them, just name it tutorial.ipynb.
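
The advice on printouts and dataframe display could look roughly like the cell below. It is only a sketch: the frame `my_df`, its column names, and the random data are made up for illustration, and pandas is assumed to be installed in the user's environment.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the notebook's raw series.
rng = np.random.default_rng(0)
my_df = pd.DataFrame(rng.normal(size=(100, 2)), columns=["raw_x", "raw_y"])

# Keep output short: show at most 7 rows rather than the whole frame.
preview = my_df.head(7)

# In a notebook, end the cell with the bare expression instead of print(preview);
# Jupyter then renders the DataFrame as a formatted table.
preview
```

Outside a notebook the final bare expression does nothing, which is why scripts still need `print`.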

jayzer · 2025-05-20T16:57:21Z

OK, sure, noted on the recommendations. The numbering scheme makes sense as well; the next Jupyter examples could cover live update mode, or explicitly defined input spaces with the unscented filter, to complete a tutorial set.

fkiraly · 2025-06-07T14:23:20Z

are you still working on this, @jayzer?

jayzer · 2025-06-17T00:31:59Z

Yes, thanks for checking... Another project came up. I hope to come back with some updates in a couple of weeks, unless someone else wants to have a go at writing some Jupyter notebook documentation.

fkiraly · 2025-06-28T13:06:54Z

it is almost done, only small improvements needed imo - thanks for contributing!

jayzer and others added 2 commits May 8, 2025 19:55

Add files via upload

52181e3

Added an example of a use case, multivariable, with extreme data and missing data.

Add: Multivariate Kalman Filter with Rolling Window Example

ee42dea

Example use case of Multivariate Kalman Filter with a rolling window in Jupyter Notebook format.

jayzer closed this May 9, 2025

jayzer reopened this May 9, 2025

fkiraly requested changes May 11, 2025

fkiraly changed the title ~~Add: Multivariate Kalman Filter with Rolling Window Example~~ [DOC] jupyter notebook - training, rolling window, multivariate May 11, 2025

fkiraly added the documentation Documentation & tutorials label May 11, 2025

jayzer requested a review from fkiraly May 13, 2025 23:33

fkiraly mentioned this pull request May 15, 2025

Add a smoothing example, multivariate with missing data and extreme data #158

Closed

fkiraly requested changes May 20, 2025
