Transcode Tests

Also look at the following snapshot of the explorer, captured when multiple transcode sessions were launched. It shows external/independent workers being launched first:

Excellent response, Ram. I accept the stochastic nature of the algorithm. Given the total amount of stake on Genesis, wouldn’t the probability of winning work always be highest with Genesis unless there is some counterweight? Over time I would anticipate that Genesis would “win” most jobs, particularly as a pool that can scale up unlimited instances compared to a single machine.
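To put rough numbers on that concern, here is a minimal sketch that assumes a worker's chance of winning a job is simply proportional to its stake. The thread does not state the actual selection rule, and the stake figures below are hypothetical, so treat this as an illustration of the question rather than a description of the network.

```python
import random

# Hypothetical stake figures; the thread gives no real numbers.
STAKES = {"genesis": 900_000, "BDC": 50_000, "BluBlu": 50_000}


def win_probabilities(stakes):
    """Chance of winning a single job, ASSUMING selection is proportional to stake."""
    total = sum(stakes.values())
    return {name: stake / total for name, stake in stakes.items()}


def simulate(stakes, jobs, seed=0):
    """Draw `jobs` winners with stake-proportional weights and count wins per worker."""
    rng = random.Random(seed)
    names = list(stakes)
    weights = [stakes[name] for name in names]
    wins = dict.fromkeys(names, 0)
    for winner in rng.choices(names, weights=weights, k=jobs):
        wins[winner] += 1
    return wins


if __name__ == "__main__":
    print(win_probabilities(STAKES))
    print(simulate(STAKES, jobs=1000))
```

Under this assumption the largest staker wins in proportion to its share of the total stake, so any counterweight would have to change the weights themselves.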

@Ram_Penke
Thanks for the updates, Ram! My streams have been working well today until I ran into a slight hiccup. I was uploading 3 streams in parallel (733MB files), getting to about 40% uploaded in a 5-10 minute time frame. I cancelled these uploads midway when I noticed one of the video files was the wrong one. I restarted 3 new streams with the same files and it took 10 minutes to upload 1%, over 1 hour to get to 9%, and it has since stalled at 9% for another hour. I tested my ISP with a speed test, thinking they might be throttling my upload, but it was at the same speed as before the tests. All three streams froze at the same time and at the same %.

@BluBlu As you provided the stream-ids, I will ask the video-infra team to check whether throttling is happening on the VideoCoin network ingest side. The ingest has load balancing in place and is less likely to cause throttling, but we will look at the issue and try to reproduce it. I will keep you posted on the progress.

@Ram_Penke
I left the 3 streams uploading overnight and they were still frozen at the same % for all 3.

I woke up and did a new batch with the same 3 files and settings. These uploaded extremely quickly (under 5 minutes for three 733MB files). Attached below is the snip of the successful streams.

Note that Parallel 3 was done by the BluBlu machine, Parallel 2 by the genesis pool, and Parallel 1 by BDC. You did mention a delay after the last file was transcoded. This happened to the BDC machine in two trials but not to the other 2 machines running in parallel.

I was wondering if some event (yesterday it was canceling mid-upload) caused a bottleneck in the uploads. It is strange to me that this cleared up overnight and the same 3 streams worked very quickly today.

After a long delay of over 10 minutes, parallel 1 FAILED while the other 2 were successful.

@BluBlu
Thanks for sharing the details and stream-ids. The video-infra team should be able to figure out the issue based on the captured logs. I will update you on the findings on Monday.

I was able to recreate some issues multiple times. What is happening is that it only allows me two concurrent streams, and any additional streams fail until one of the two ongoing streams finishes. Only when one or both of the streams finish will it allow me to start a new stream successfully.

To stress test my worker, I was able to bypass the 2-stream limit I’ve been running into by creating a secondary VideoCoin account. I was then able to start 3 concurrent streams (2 on account #1, 1 on account #2). Here are some of my thoughts and findings.

- Each individual stream gets fully sent to an individual worker and is not segmented out to different workers. Is this working as intended?

- Santiago’s BDC stream gets stuck for over 20 minutes and ends up failing (3 instances). Below is the stream ID for the BDC stream that stalled and failed.

- See the attached image below, showing that a zone0 worker was brought up and chosen over BluBlu to do the work for stream “2/2”.

@BluBlu

We still need to look at the issues that you reported earlier. We will also look at (1) the 20-minute wait and subsequent failures of a specific worker and (2) your observation about the limit on running multiple instances.

Following is my feedback on your observation “Each individual stream gets fully sent to an individual worker…”.
In this particular test, the file that you used seems to have a duration of 30 seconds. The pipeline creates only one segment and sends it to one worker. Try testing with longer-duration clips and you will see the file distributed to multiple workers.
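A rough way to reason about this, assuming fixed-length segments: the number of segments bounds how many workers a single file can reach. The 30-second default below is an assumption taken from the clip length discussed here, not a confirmed pipeline setting, and both function names are hypothetical.

```python
import math


def segment_count(duration_s, segment_s=30.0):
    """Number of segments for a clip, ASSUMING fixed-length segmentation.

    segment_s=30.0 is an assumed value, not a documented pipeline parameter.
    """
    if duration_s <= 0:
        return 0
    return math.ceil(duration_s / segment_s)


def max_workers_used(duration_s, n_workers, segment_s=30.0):
    """Upper bound on distinct workers that can receive part of one file."""
    return min(segment_count(duration_s, segment_s), n_workers)


if __name__ == "__main__":
    for duration in (30, 150):
        print(duration, segment_count(duration), max_workers_used(duration, n_workers=3))
```

With these assumptions a 30-second clip yields one segment and can only reach one worker, while a 150-second clip yields five segments and can be spread across all three workers.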

@Ram_Penke

Great, thank you for that. I will try testing with longer-duration videos.

I ran the following trailer using the link directly from the HD Trailer website:

The process seems to be stuck in a loop of “get CPU capacity” and “cpu capacity is 100”.

My worker shows hung up as shown by the following graphic:

After a few minutes it finally completed and I got this:

Since I submitted the test I can see what I was charged as a publisher, essentially $0.025 / 60 seconds of video as seen here:

So as a worker I was paid ~$0.0098 for a 30-second chunk, or roughly $0.02/minute. @Ram_Penke is it safe to surmise that VideoCoin’s commission is set at $0.005 / 60 seconds?
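The arithmetic behind that estimate can be sketched as follows, using only the figures quoted in this post. Treating the remainder as the network's share is this post's own assumption, and the function name is hypothetical.

```python
def payment_split(publisher_rate_per_min, worker_pay_per_chunk, chunk_seconds):
    """Return (worker rate per minute, remainder per minute, remainder as a share).

    The remainder is what is left of the publisher charge after the worker payout;
    who receives it is an assumption, not something this calculation can show.
    """
    worker_rate_per_min = worker_pay_per_chunk * 60 / chunk_seconds
    remainder_per_min = publisher_rate_per_min - worker_rate_per_min
    remainder_share = remainder_per_min / publisher_rate_per_min
    return worker_rate_per_min, remainder_per_min, remainder_share


if __name__ == "__main__":
    # Figures from the post: $0.025 per 60 s charged, ~$0.0098 paid per 30 s chunk.
    worker, remainder, share = payment_split(0.025, 0.0098, 30)
    print(f"worker: ${worker:.4f}/min, remainder: ${remainder:.4f}/min ({share:.1%})")
```

With these inputs the worker rate comes to about $0.0196 per minute and the remainder to about $0.0054 per minute, a little over a fifth of the publisher charge.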

@Santiago_Velez

Following is my feedback on your post:

(1) Regarding “The process seems to be stuck in a loop …”,
it does not seem to be stuck. You will see the following messages when it is transcoding:
… msg=“transcoding.” …
… msg=“segment has been transcoded” …
… msg=“uploading segment”, …
… msg=“submitting proof” …
The log that you showed is normal in the idle state.
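A worker operator could check for those messages with a small filter like the sketch below. It only does substring matching on the four messages listed above; the full log line format is not shown in this thread, so the sample lines are hypothetical.

```python
# Messages that, per the list above, appear while a worker is transcoding.
ACTIVE_MESSAGES = (
    "transcoding",
    "segment has been transcoded",
    "uploading segment",
    "submitting proof",
)


def is_active_line(line):
    """True if a log line contains one of the transcoding-related messages."""
    return any(message in line for message in ACTIVE_MESSAGES)


def worker_looks_active(lines):
    """True if any line in a batch of log lines indicates transcoding work."""
    return any(is_active_line(line) for line in lines)


if __name__ == "__main__":
    # Hypothetical sample lines; only the msg text comes from the thread.
    idle_log = ['msg="get CPU capacity"', 'msg="cpu capacity is 100"']
    busy_log = ['msg="cpu capacity is 100"', 'msg="uploading segment"']
    print(worker_looks_active(idle_log), worker_looks_active(busy_log))
```

A log containing only the CPU capacity messages would be classified as idle, which matches the explanation above.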

(2) Regarding “My worker shows hung up as shown by the following graphic:…”
The “busy” state indicates that work is assigned. It is not hung up.

(3) Regarding “it safe to surmise that VideoCoin’s commission …”
The percentage seems to follow the ratio as described here.

How can that be if the worker gets paid $0.01 per 30-second work packet ($0.02/minute) but the publisher is charged $0.025/minute of video? Where does the $0.005/minute go if not to VideoCoin? Don’t get me wrong, I think this is fair and deserved. I’m simply stating that the link you provided does not show this commission to the network operator, and it should for full transparency.

@Ram_Penke

Another day, another test. This is the first instance where the publisher was charged but the work was not completed and there was no output source. For reference, this was a multi-segment video upload from local storage. It was sent to 3 workers (genesis, BDC, BluBlu).

Explorer showed that all three workers were busy with the work. BluBlu and the genesis pool completed while BDC stalled at “Busy”. After a few minutes the stream ended up failing, but the publisher was charged. My question is: if 1 segment of the video fails on encoding, are all other segments still charged? Could there be a backup for the failed segment when a worker cannot complete it, so that the entire video could be completed?

Here are snips of the BluBlu and Genesis pool publicmint explorers showing the payment for the job, while BDC’s was unpaid.


This brings up another issue in that in my tests it seems as if BDC is mostly running into stalling problems that you mentioned. How would a worker identify the issues with their specific worker on a failed job? I’m assuming that this would also lead to a slashing event when that is implemented. What if the issue is not on the worker end and has to do with an external source such as a bug?

@Santiago_Velez
All these transactions are recorded on the blockchain.
The blockchain event AccountFunded indicates the amount transferred to the worker.
The blockchain event ServiceFunded indicates the amount retained by the network.

@BluBlu

Good exploration of the network. There are multiple items to consider here. The VideoCoin network operates at the API level, and the Console is a Web UI application on top of the API.

Payment for a Segment:
At the API level, each segment is paid only if it is successfully encoded and made available to the publisher. The publisher can retrieve this segment and pass it to a CDN, perform further processing, save it, etc. From this point of view, all the segments that are paid are available to the publisher through the API.

Failed or delayed segment transcode:
Slashing, when enabled, is expected to enforce discipline on workers.

Backup for transcoding failed segments:
It should be there in the video-infra. I will check and get back to you to confirm whether it is enabled.

Finally the failure of the stream reported by Console:
I think it is a bug in the console application if all the segments of the file are encoded but the console fails to identify the completion. We will look at it and update you.

@BluBlu
Feedback on your question regarding network bug vs faulty worker:

  1. Docker containers are expected to provide a uniform execution environment.
  2. As part of enabling slashing, performance metrics will be collected that keep track of successful transcodes and failures.
  3. When a reasonable number of workers are running, bugs will show up across multiple workers.
    Migration to the UI-based worker control panel (under development) will make it easy to raise alerts for any worker issues.

In summary, slashing will happen only to workers that are intentionally made to fault or that attempt some form of attack on the network. Slashing is mainly triggered when a worker submits a false proof-of-work. Crashes and downtime may be used to lower the rank of a worker in selection.

This can be discussed more while introducing slashing.

@BluBlu and others trying the transcoding:
The minimum required credit to run the transcoding is set to $1 from today.

Apologies if I am overloading you with similar requests. In this instance all segments but one were completed and paid. The last segment has been stalled for over 30 minutes while the worker is stuck in “busy”. This is preventing a full video output. As Ram stated, the paid segments are available to the publisher, but not through the current console. Is there a time cap on a worker to finish this work before it gets sent to another worker to complete?



The end result was a 40+ minute wait before the last worker went from “busy” to “idle” and the stream failed.

@BluBlu We are looking at the issue and will resolve it soon.