TEZ-4708: TezTaskUmbilicalProtocol to use a protobuf-based protocol#483
TEZ-4708: TezTaskUmbilicalProtocol to use a protobuf-based protocol#483Aggarwal-Raghav wants to merge 1 commit intoapache:masterfrom
Conversation
|
💔 -1 overall
This message was automatically generated. |
steveloughran
left a comment
There was a problem hiding this comment.
very fast implementation of the core design.
I do think the payloads of each message should be proto messages, rather than wrapped Writables, as that's more in line with what proto RPCs are meant to do, and hopefully more robust.
what you do have is already better than today and will handle the removal of WritableRPCEngine
| option java_generic_services = true; | ||
| option java_generate_equals_and_hash = true; | ||
|
|
||
| message GetTaskRequestProto { |
There was a problem hiding this comment.
ideally the protocol should be marshalling the fields in the proto message, rather than just serializing the writable and unmarshalling.
There was a problem hiding this comment.
Ack. Will check
There was a problem hiding this comment.
@steveloughran , can you please provide feedback on the updated TezTaskUmbilicalProtocol.proto. Because of structured mapping vs Wrapped Writable (old approach), the lineage of messages that needs to be defined has increased 😅
TezTaskAttemptID → TezTaskID → TezVertexID → TezDAGID → ApplicationId (Hadoop) → clusterTimestamp (long) + appId (int)
In MAPREDUCE-6706, its still Wrapped Writable as in the top comment in TaskUmbilicalProtocol.proto Could you confirm if the level of granularity I've introduced here is what you had in mind for Tez? I want to ensure I've not over-complicated the .proto definition.
Writable-serialized types are carried as opaque bytes, for example jvm_context and jvm_task
| * limitations under the License. | ||
| */ | ||
|
|
||
| syntax = "proto2"; |
There was a problem hiding this comment.
we can go for proto3 I believe
58f6366 to
b7f12d6
Compare
| /** Called by the Task process to retrieve the work from the AM. */ | ||
| @Override | ||
| public ContainerTask getTask(ContainerContext containerContext) throws IOException { | ||
| return service( |
There was a problem hiding this comment.
Used the same the service and translate FunctionalInterfaces for Exception wrapping as in MR
| protocolName = "org.apache.tez.runtime.internals.protocolPB.TezTaskUmbilicalProtocolBlockingPB", | ||
| protocolVersion = 1) | ||
| public interface TezTaskUmbilicalProtocolBlockingPB | ||
| extends TezTaskUmbilicalProtocol.BlockingInterface, VersionedProtocol {} |
There was a problem hiding this comment.
Have added VersionedProtocol which is additional compared to MR
There was a problem hiding this comment.
the java class naming is kept to be inline with DAGClientAMProtocolBlockingPB
|
💔 -1 overall
This message was automatically generated. |
|
|
ok, ProtoRPCEngine uses protobuf 2.5 public and is frozen if you want to use our 3.x version (currently 3.25)
I'd recommend you do that so no audits of dependencies flag up protobuf 2.5 jars. Recent hadoop releases (3.4.3+) do not bundle these |

No description provided.