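s DAGScheduler TaskScheduler Worker

[Diagram: job scheduling pipeline]
Build operator DAG -> (DAG) -> DAGScheduler -> (TaskSet) -> TaskScheduler -> (Task) -> Worker
DAGScheduler: split graph into stages of tasks; submit each stage as ready.
TaskScheduler (Cluster Manager): launch tasks via cluster manager; retry failed or straggling tasks.
Worker (Executor, Block manager): execute tasks; store and serve blocks.
Introduction to Spark Internals @Matei

Our model for tracking the community

[Diagram labels, in source order: BigJobs; Spark Repository; submit Pull Request; Github; stress testing; "Is it valuable to the community?"; Fork; Weekly Merge; fix bug; Spark Repository; internal merge; internal Gitlab; test server; release; production server; pass; production server]

What have we done?

Public:
1. Pull Request 681: Remove active job from idToActiveJob when job finished or aborted
2. Pull Request 689: Jobs are always marked as SUCCEEDED even it's actually failed on Yarn
3. Pull Request 757: ResultTask's serialization forget about handling "generation" field, while ShuffleMapTask does

Not public (closely tied to the Yunti (云梯) Yarn team):
1. Added user permission management
2. Caching mechanism for job jar files
3. Automatic configuration of Spark's temporary cache directory
4. Wrapped a launch script that reads Spark job performance parameters from a resource file
5. Added a Syslo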