new cc format for msnbc? #1681
Comments
I'm not super involved in the code these days but someone that is currently active will take a look ASAP. |
Playing with the ccx_encoders_srt.c file, I found this "solution" that removes periods in places like the example you showed, @williamj77. However, it also strips periods from the ends of all sentences, which affects expected output in other cases. That's why I'm not making a pull request: it can affect other use cases. Still, it might help someone else refine the logic. |
Hello,
Is the output correct?
If so, can you send me the Windows exe?
I am more of a C#/Java developer.
Sincerely,
William Johnston
From: David
Sent: Wednesday, April 23, 2025 6:19 PM
To: CCExtractor/ccextractor
Cc: William Johnston ; Mention
Subject: Re: [CCExtractor/ccextractor] new cc format for msnbc? (Issue #1681)
321david123 left a comment (CCExtractor/ccextractor#1681)
ccx_encoders_srt.c.zip
|
It does work, @williamj77. Here's the complete file: If you want to try it yourself on other files, you just have to replace the encoder with the one in my last comment and follow the standard build instructions. Hope this helps! |
If the problem is with the input, we shouldn't do anything. Periods can be removed by a post script if needed. If the problem is that we're not processing the input correctly, then we should figure out what's going on. But if players such as VLC display the periods, then they're just there and there's nothing for us to fix. |
Carlos:
The cc is correct for a different tv channel.
And I have been experiencing hacker issues.
Again, I was wondering if the output is correct for the updated code.
Sincerely,
William Johnston
From: Carlos Fernandez Sanz
Sent: Friday, April 25, 2025 11:31 AM
To: CCExtractor/ccextractor
Cc: William Johnston ; Mention
Subject: Re: [CCExtractor/ccextractor] new cc format for msnbc? (Issue #1681)
cfsmp3 left a comment (CCExtractor/ccextractor#1681)
|
Can anyone build and send me the exe for the updated code?
I am not a native C++ developer anymore.
|
Hello William
Here is a binary of the latest CCExtractor code.
Hope this helps :)
CCExtractor latest
<https://drive.google.com/file/d/1VetQZd559QRFrGFG-_HvB3BKKRIpg39W/view?usp=share_link>
|
Thanks for the exe.
Can you include this updated code?
From: Vatsal Keshav
Sent: Monday, April 28, 2025 4:15 PM
To: CCExtractor/ccextractor
Cc: William Johnston ; Mention
Subject: Re: [CCExtractor/ccextractor] new cc format for msnbc? (Issue #1681)
vats004 left a comment (CCExtractor/ccextractor#1681)
|
Sure, @321david123's code seems to output the same captions as before. It'll work great with a little refining.
Until then, here's a post-processing binary for removing unnecessary periods: [srt-cleaner](https://drive.google.com/file/d/1YsXf_y5mRu7JNSSFwxR9eXeVaHofQuUR/view?usp=share_link) ([GitHub](https://github.com/vats004/srt-cleaner))
Use it like this:
`$ srt-cleaner your_input.srt your_output.srt`
Edit:
I tried running it with `if (ccx_options.hauppauge_mode){//period-removing-logic}`, but @321david123's logic works without that.
Here is a binary of the code as of 29 Apr 2025 plus the period-removing logic contributed by @321david123:
[updated ccxr](https://drive.google.com/drive/folders/1hG-CBe9S82DoL79XneJ49zxywmYBdiF-?usp=share_link)
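For anyone who prefers a script over a binary, the post-processing idea can be sketched in a few lines of Python. This is not the code of srt-cleaner or of @321david123's encoder change. The thread does not reproduce the affected captions, so the pattern treated as a stray period here (a `.` standing alone between spaces) is purely an assumption, and the names `clean_caption_line`, `clean_srt` and the `enabled` switch (standing in for a check like `ccx_options.hauppauge_mode`) are hypothetical.

```python
import re

# ASSUMPTION: a "stray" period is a '.' that stands alone, i.e. it is preceded
# by whitespace (or starts the line) and is followed by whitespace and more
# text. Periods attached to a word ("WELCOME.") and ellipses are not matched.
_STRAY_PERIOD = re.compile(r"(?:(?<=\s)|^)\.(?=\s+\S)")


def clean_caption_line(line: str, enabled: bool = True) -> str:
    """Remove isolated periods from a single SRT text line."""
    # Leave cue numbers and timestamp lines exactly as they are.
    if not enabled or "-->" in line or line.strip().isdigit():
        return line
    cleaned = _STRAY_PERIOD.sub("", line)
    if cleaned == line:
        return line
    # Collapse the double space left behind where a period was removed.
    return re.sub(r" {2,}", " ", cleaned).strip()


def clean_srt(text: str, enabled: bool = True) -> str:
    """Apply clean_caption_line to every line of an SRT document."""
    return "\n".join(clean_caption_line(l, enabled) for l in text.split("\n"))
```

Periods attached to a word, such as the one ending a sentence, are left alone, which is the case the earlier encoder patch in this thread was reported to break.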
|
Thanks again.
But please take a look at the srt file with added periods.
Again, can you create an exe with the updated code?
From: Vatsal Keshav
Sent: Tuesday, April 29, 2025 7:19 AM
To: CCExtractor/ccextractor
Cc: William Johnston ; Mention
Subject: Re: [CCExtractor/ccextractor] new cc format for msnbc? (Issue #1681)
vats004 left a comment (CCExtractor/ccextractor#1681)
|
Hi, if you're using Windows, you could simply run the Docker build to test different files.
If you want to run it with 321david123's new SRT encoder, I've made a branch with the updated code (credit for the code goes to 321david123).
This is for testing files; if there's a need for an exe, we can prepare one. |
Hello,
I don’t like docker on my machine.
Is there a workaround?
From: Deepnarayan Sett
Sent: Tuesday, April 29, 2025 1:13 PM
To: CCExtractor/ccextractor
Cc: William Johnston ; Mention
Subject: Re: [CCExtractor/ccextractor] new cc format for msnbc? (Issue #1681)
steel-bucket left a comment (CCExtractor/ccextractor#1681)
Hi, if you're using Windows, you could simply run the Docker build to test different files.
Here are the instructions to run the main branch; just replace \path\to\video\ with the location of your file, then copy and paste the commands into a terminal.
To test another file, just run `docker run --rm -v $(pwd):$(pwd) -w "$(pwd)" --user $(id -u):$(id -g) ccextractor:latest <YOURFILE> --hauppauge -o output.srt`
git clone https://github.com/CCExtractor/ccextractor.git
cd ccextractor\docker
docker build --platform linux/amd64 -t ccextractor .
copy \path\to\video\all_in_with_chris_hayes_20250326_1958.ts .
docker run --rm -v $(pwd):$(pwd) -w "$(pwd)" --user $(id -u):$(id -g) ccextractor:latest all_in_with_chris_hayes_20250326_1958.ts --hauppauge -o output.srt
If you want to run it with 321david123's new SRT encoder, I've made a branch with the updated code (credit for the code goes to 321david123):
git clone https://github.com/steel-bucket/ccextractor/ -b 321david123-FIX
cd ccextractor/docker
docker build --platform linux/amd64 -t ccextractor .
copy \path\to\video\all_in_with_chris_hayes_20250326_1958.ts .
docker run --rm -v $(pwd):$(pwd) -w "$(pwd)" --user $(id -u):$(id -g) ccextractor:latest ./all_in_with_chris_hayes_20250326_1958.ts --hauppauge -o output.srt
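The `$(pwd)` and `$(id -u):$(id -g)` parts of the `docker run` lines are shell substitutions, so they assume a POSIX-style shell. As a rough sketch, the same argument list can be assembled in Python. `build_docker_cmd` is a hypothetical helper, only the flags shown in the commands above are used, and leaving out `--user` where `os.getuid` is unavailable is my assumption rather than anything the project documents.

```python
import os


def build_docker_cmd(video, workdir=None, output="output.srt"):
    """Return the `docker run` argument list from the commands above."""
    # Equivalent of $(pwd): default to the current working directory.
    workdir = workdir or os.getcwd()
    cmd = ["docker", "run", "--rm",
           "-v", f"{workdir}:{workdir}",  # mount the directory at the same path
           "-w", workdir]                 # and make it the container's cwd
    # Equivalent of $(id -u):$(id -g). os.getuid/os.getgid exist only on
    # POSIX systems; skipping --user elsewhere is an assumption of this sketch.
    if hasattr(os, "getuid"):
        cmd += ["--user", f"{os.getuid()}:{os.getgid()}"]
    cmd += ["ccextractor:latest", video, "--hauppauge", "-o", output]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the directory that holds the video.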
|
Carlos:
Can you take a look at this ts file?
If you remember, you created a custom version of CCExtractor for the Hauppauge TV tuner.
There now seem to be periods inserted into the text.
Attached are the files.
You can download the ts file here:
https://www.dropbox.com/scl/fi/4b1y86efag39sjnmm65hs/all_in_with_chris_hayes_20250326_1958.ts?rlkey=tyid6blj5hvsbyhg1mxs9nvr8&st=557jrkq8&dl=0
Again, this is from a Hauppauge TV tuner.
I have experienced issues from a potential hacker.
Sincerely,
William Johnston
ccoutput.zip