SCRIPT & CODE EXAMPLE
 
CODE EXAMPLE FOR PYTHON

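Save this RDD as a SequenceFile of serialized objects with RDD.saveAsPickleFile(path, batchSize). Objects are serialized with pickle, and the default batch size is 10.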

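from tempfile import NamedTemporaryFile
# `sc` is an existing SparkContext. Closing the temp file deletes it, leaving a free path.
tmpFile = NamedTemporaryFile(delete=True)
tmpFile.close()
# Write the RDD as pickled objects with batchSize=3.
sc.parallelize([1, 2, 'spark', 'rdd']).saveAsPickleFile(tmpFile.name, 3)
# Read it back with SparkContext.pickleFile (minPartitions=5).
sorted(sc.pickleFile(tmpFile.name, 5).map(str).collect())
# ['1', '2', 'rdd', 'spark']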
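
To make the idea of "serialized objects" written in batches more concrete, here is a minimal Spark-free sketch that uses only Python's standard pickle module. The helper names pickle_in_batches and unpickle_batches are hypothetical and exist only for this illustration. The sketch mimics the batch size argument from the example above, but it does not reproduce Spark's actual SequenceFile layout.

```python
import pickle
from itertools import islice


def pickle_in_batches(items, batch_size):
    """Yield one pickled bytes blob per batch of up to batch_size items.

    Hypothetical helper: it only imitates the idea of batched pickling,
    not the on-disk format Spark writes.
    """
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield pickle.dumps(batch)


def unpickle_batches(blobs):
    """Yield the original objects back from a sequence of pickled batches."""
    for blob in blobs:
        yield from pickle.loads(blob)


data = [1, 2, 'spark', 'rdd']

# Batch size 3 splits the four items into two batches: 3 items, then 1 item.
blobs = list(pickle_in_batches(data, 3))

# Mixed ints and strings cannot be compared directly in Python 3, so the
# values are converted to str before sorting, as in the Spark example.
restored = sorted(str(x) for x in unpickle_batches(blobs))
print(restored)
```

Larger batches mean fewer pickle calls and less per-record overhead, at the cost of holding more objects in memory at once.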
Source: spark.apache.org
 