[DOC] jupyter notebook - training, rolling window, multivariate by jayzer · Pull Request #159 · pykalman/pykalman · GitHub

[DOC] jupyter notebook - training, rolling window, multivariate #159

Open
wants to merge 2 commits into
base: main

Conversation

jayzer
@jayzer jayzer commented May 8, 2025

Multivariate Kalman Filter with Rolling Window Example

This example demonstrates how to apply a Multivariate Kalman Filter to smooth noisy time-series data using a rolling window approach. It covers the following:

  • Simulating data with noise, outliers, and missing values.
  • Initializing and training the Kalman Filter.
  • Implementing a rolling window for retraining and smoothing.
  • Visualizing the results.

This example provides an illustration of how to use the pykalman library in a rolling window use case. Comments include explanations of how to use the package for data imputation, outlier handling, and dealing with missing values.
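
The notebook itself is not reproduced in this thread. As a rough illustration of the first two steps listed above, here is a minimal pure-Python sketch of a scalar local-level (random walk) Kalman filter that skips the update step for missing points and for observations that fail a simple innovation gate. It is not pykalman's implementation and not the notebook's code: the function names, the gate threshold, and all parameter values are assumptions chosen for illustration, and the example is univariate rather than multivariate. In pykalman itself, missing observations are typically passed as masked entries of a NumPy masked array.

```python
import math
import random


def simulate(n=200, noise_sd=0.5, outlier_every=25, missing_every=10, seed=0):
    """Noisy sine wave with periodic outliers and missing points (None)."""
    rng = random.Random(seed)
    data = []
    for t in range(n):
        y = math.sin(2 * math.pi * t / 50) + rng.gauss(0, noise_sd)
        if t % outlier_every == outlier_every - 1:
            y += 8.0  # inject an outlier
        if t % missing_every == missing_every - 1:
            y = None  # drop the observation entirely
        data.append(y)
    return data


def kalman_filter_1d(obs, q=0.05, r=0.25, x0=0.0, p0=1.0, gate=3.0):
    """Local-level Kalman filter (hypothetical helper, not a pykalman API).

    q and r are the assumed process and observation noise variances.
    Missing observations (None) and observations whose innovation exceeds
    `gate` standard deviations get a prediction step only.
    """
    x, p = x0, p0
    means = []
    for y in obs:
        p = p + q  # predict: random-walk state, so the mean is unchanged
        if y is not None:
            s = p + r  # innovation variance
            innovation = y - x
            if abs(innovation) <= gate * math.sqrt(s):
                k = p / s  # Kalman gain
                x = x + k * innovation
                p = (1 - k) * p
        means.append(x)
    return means


data = simulate()
filtered = kalman_filter_1d(data)
```

Because skipped points leave the state mean untouched, the filtered series carries the last estimate forward across gaps, which is the simplest form of imputation. The innovation gate is one common heuristic for outliers; whether the notebook uses it is not stated in the thread.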

jayzer and others added 2 commits May 8, 2025 19:55
Added an example of a use case, multivariable, with extreme data and missing data.
Example use case of Multivariate Kalman Filter with a rolling window in Jupyter Notebook format.
@jayzer jayzer closed this May 9, 2025
@jayzer
Author
jayzer commented May 9, 2025

rewriting second pull request

@jayzer
Author
jayzer commented May 9, 2025

Multivariate Kalman Filter with Rolling Window Example

This example demonstrates how to apply a Kalman Filter to smooth noisy time-series data using a rolling window approach. It covers the following:

  • Simulating data with noise, outliers, and missing values.
  • Initializing and training the Kalman Filter on some initial data points.
  • Implementing a rolling window for retraining and smoothing on the rest of the data, an approach that illustrates how the filter could be applied incrementally as new data arrives.
  • Visualizing the results.

This example provides an illustration of how to use the pykalman library in a rolling window use case. Comments include explanations of how to use the package for data imputation, outlier handling, and dealing with missing values.
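
The rolling window idea described above (train on some initial points, then keep filtering while periodically retraining on recent data) can be sketched in pure Python as follows. This is an assumption-laden illustration, not the notebook's code: `estimate_obs_var` is a crude moment estimator standing in for a proper training step such as EM, and the window sizes, `q`, and all function names are hypothetical.

```python
import random
import statistics


def estimate_obs_var(window):
    """Crude stand-in for training: half the variance of first differences.

    For a slowly varying level, Var(y[t] - y[t-1]) is roughly 2 * r,
    where r is the observation noise variance. Missing points are skipped.
    """
    vals = [y for y in window if y is not None]
    diffs = [b - a for a, b in zip(vals, vals[1:])]
    return max(statistics.variance(diffs) / 2.0, 1e-6)


def filter_step(x, p, y, q, r):
    """One predict/update step of a local-level Kalman filter."""
    p = p + q
    if y is None:  # missing observation: prediction only
        return x, p
    k = p / (p + r)
    return x + k * (y - x), (1 - k) * p


def rolling_filter(obs, n_train=50, window=50, retrain_every=25, q=0.01):
    """Train on the first n_train points, then filter the rest one point at a
    time, re-estimating r from the trailing window every retrain_every steps."""
    r = estimate_obs_var(obs[:n_train])
    x, p = 0.0, 1.0
    for y in obs[:n_train]:  # warm up the state on the training data
        x, p = filter_step(x, p, y, q, r)
    out, r_history = [], [r]
    for i in range(n_train, len(obs)):
        if i > n_train and (i - n_train) % retrain_every == 0:
            r = estimate_obs_var(obs[i - window:i])  # retrain on recent data
            r_history.append(r)
        x, p = filter_step(x, p, obs[i], q, r)
        out.append(x)
    return out, r_history


# Synthetic data: slow linear trend plus noise, with a gap every 20 points.
rng = random.Random(1)
obs = [0.02 * t + rng.gauss(0, 0.5) for t in range(200)]
for t in range(5, 200, 20):
    obs[t] = None

filtered, r_history = rolling_filter(obs)
```

The filter state is carried across retraining boundaries, so only the noise parameter changes at each refresh. Processing one point at a time mirrors how the filter could be applied incrementally as new data arrives.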

@jayzer jayzer reopened this May 9, 2025
Collaborator
@fkiraly fkiraly left a comment

Quick question: there are two PRs that add the same files. Which one should we be looking at? The other one is #158.

I assume this one, i.e., #159, because it is from a branch and not your fork main?

@fkiraly fkiraly changed the title Add: Multivariate Kalman Filter with Rolling Window Example [DOC] jupyter notebook - training, rolling window, multivariate May 11, 2025
@fkiraly fkiraly added the documentation Documentation & tutorials label May 11, 2025
@jayzer
Author
jayzer commented May 13, 2025

Sorry if it was unclear. There are actually two distinct files for the examples section, which I tried to submit in two PRs. The first is a more basic example of the Kalman filter: it generates data with added extremes and missing points, then plots the raw data alongside the KF-smoothed data. The second smooths batches in a rolling window, with periodic updates to the filter.

examples/pykalman_multivariate_example.ipynb

examples/pykalman_rolling_window_example.ipynb

thanks

@jayzer jayzer requested a review from fkiraly May 13, 2025 23:33
@fkiraly
Collaborator
fkiraly commented May 14, 2025

Thanks!

The question was not about the files, but about the two pull requests, #158 and #159. Both pull requests contain both files, so we can merge at most one of them (without conflicts) - which of the two pull requests is the correct one to look at?

@jayzer
Author
jayzer commented May 14, 2025

I think let's try 159 then? I did not change either Jupyter notebook between 158 and 159, only the pull comment, so I expect the files to be equivalent in that case, with no expected conflict.

@jayzer
Author
jayzer commented May 19, 2025

any other thoughts on this pull request?

Collaborator
@fkiraly fkiraly left a comment

Overall, very nice pull request! We have been looking for someone to help with jupyter notebooks, and these are nice! Also see issue #129.

A few comments:

  • I would use headers more - you can use markdown cells and hashes such as ### this is a header for headers of different levels. For instance, the header comments in code - "step 1" etc, might look better in markdown, and also allow integration with search and table of contents later
  • remove the pip install commands at the start - expect that users have an environment already set up. Also, a valid environment may break when running the notebook without caution.
  • I would also advise to be more concise and to-the-point, avoid long text, use "telegram style" or "bullet point style" more - so users get to the "meat" quicker. For instance, instead of "This demonstration showcases the application of the Kalman Filter for smoothing noisy time-series data using the pykalman library in Python. We'll simulate data with noise, outliers, and missing values.", you could write:
    • this notebook: Kalman Filter for smoothing
    • on data with noise, outliers, missing values
  • avoid too long printouts. As a rule of thumb, a printout should be 7 rows max.
  • dataframes do not need to be printed, you can simply evaluate them on the last line, and they will "pretty display" in jupyter. E.g., write my_df on the last line of a code cell, instead of print(my_df)
  • the two notebooks could be two chapters in a single notebook? They are quite similar. If you order them, it might be easier to follow for the user.
  • alternatively, you could number the notebooks, so users read them in the intended order.
  • re naming, it is clear that the notebooks are about pykalman and that they contain examples, so you could shorten the names to 01_multivariate and 02_rolling_window. Alternatively, if you merge them, then just name it tutorial.ipynb.
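
Two of the suggestions above (dropping the pip install cells, and moving "step 1" style code comments into markdown headers) could be applied by hand in the Jupyter UI. As a hedged sketch of the same cleanup done programmatically, the snippet below works on the notebook's JSON structure using only the standard library. The helper names and the `# Step` marker are assumptions for illustration; the actual notebooks may format their step comments differently.

```python
import json


def source_text(cell):
    """Notebook cell source may be a string or a list of strings."""
    src = cell.get("source", "")
    return "".join(src) if isinstance(src, list) else src


def strip_pip_cells(nb):
    """Return a copy of the notebook dict without code cells that run pip install."""
    kept = [
        c for c in nb["cells"]
        if not (c.get("cell_type") == "code" and "pip install" in source_text(c))
    ]
    return {**nb, "cells": kept}


def promote_step_comments(nb, marker="# Step"):
    """Turn a leading '# Step ...' comment in a code cell into a markdown header cell."""
    cells = []
    for cell in nb["cells"]:
        lines = source_text(cell).splitlines(keepends=True)
        if cell.get("cell_type") == "code" and lines and lines[0].startswith(marker):
            title = lines[0].lstrip("# ").strip()
            cells.append({"cell_type": "markdown", "metadata": {},
                          "source": "### " + title})
            cell = {**cell, "source": "".join(lines[1:])}
        cells.append(cell)
    return {**nb, "cells": cells}


# Tiny in-memory notebook for demonstration; a real one would be read
# from an .ipynb file with json.load.
nb = {
    "nbformat": 4, "nbformat_minor": 5, "metadata": {},
    "cells": [
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None, "source": "!pip install pykalman"},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None,
         "source": ["# Step 1: simulate data\n", "x = 1\n"]},
    ],
}

clean = promote_step_comments(strip_pip_cells(nb))
serialized = json.dumps(clean, indent=1)  # could be written back to an .ipynb file
```

For the display advice, no code change beyond the cell's last line is needed: ending a cell with the bare expression `my_df` instead of `print(my_df)` lets Jupyter render the dataframe.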

@jayzer
Author
jayzer commented May 20, 2025

OK, sure. Noted on the recommendations. The numbering scheme makes sense as well; the next Jupyter examples could cover live update mode, or explicitly defined input spaces with the unscented filter, to complete a tutorial set.

@fkiraly
Collaborator
fkiraly commented Jun 7, 2025

are you still working on this, @jayzer?

@jayzer
Author
jayzer commented Jun 17, 2025

Yes, thanks for checking... Had another project come up. Will come back with some updates in a couple of weeks hopefully, unless someone else wants to have a go at writing some Jupyter notebook documentation.

@fkiraly
Collaborator
fkiraly commented Jun 28, 2025

it is almost done, only small improvements needed imo - thanks for contributing!
