Background
• MPEG : Moving Picture Experts Group: a
working group of ISO/IEC
“Compactly representing digital video and audio signals
for consumer distribution”
• MPEG-1: Standard for storage and retrieval of audio
and video on storage media
• MPEG-2: Standard for digital TV
Scope of MPEG-4 Standard
• Author: greater production flexibility and
reusability
• Network Service Provider: Offering
transport information which can be
interpreted on various network platforms
• End user: Higher levels of interaction with
content within the limits set by the author.
Objectives
• Interactivity : Interacting with the different
audio-visual objects
• Scalability : Adapting content to match
bandwidth
• Reusability : For both tools and data
Objectives - Interactivity
• Client Side Interaction
– Manipulating scene description and properties of audio-
visual objects
• Audio-Visual Objects Behavior
– Triggered by user actions and other events
• Client Server Interaction
– In case a return channel is available
Objectives - Scalability
• Scalability refers to the ability to only decode a
part of a bitstream and reconstruct images or
image sequences with:
– Reduced decoder complexity (reduced quality)
– Reduced spatial resolution
– Reduced temporal resolution
• A scalable object is one that carries basic-quality
information for presentation. When enough bitrate
or resources are available, enhancement layers
can be added to improve quality.
Objectives – Scalability (Cont.)
• Scalability is a key factor in many applications:
making moving video possible at very low bitrates
notably for mobile devices
• MPEG-4 has been shown usable for streaming
wireless video at 10 kbps over GSM.
• Low bitrates are accommodated by the use of
scalable objects.
Objectives - Reusability
• Authors can easily organize and manipulate
individual components and reuse existing
decoded objects.
• Each type of content can be coded using the
most effective algorithms.
Requirements
• Traditional Requirements (MPEG-1 & 2)
– Streaming : for live broadcast
– Synchronization : to process data received at the right
instants of time
– Stream Management : to allow the application to
consume the content (content type, dependencies…etc.)
– Intellectual Property Management
• Specific MPEG-4 Requirements
– Audio-Visual objects
– Scene description
Audio-Visual Objects
• The representation of a natural or synthetic
object that has an audio and/or visual
manifestation
• Examples:
– Video Sequence (with Shape information).
– Audio Track
– Animated 3D face
– Speech synthesized from text.
• Advantages: Interaction – Scalability – Reusability
Scene Description
The coding of information that describes the
spatio-temporal relationships between the
various audio-visual objects.
Scene Description (Cont.)
• Place media objects anywhere in a given
coordinate system.
• Apply transforms to change the geometrical or
acoustical appearance of a media object.
• Group primitive media objects to form compound
media objects.
• Apply streamed data to media objects to modify
their attributes (sound, moving texture…)
• Change, interactively, the user’s viewing and
listening point anywhere in the scene.
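The operations above can be sketched with a toy scene graph (hypothetical classes, not BIFS nodes): grouping and transforms compose into world positions for each primitive media object.

```python
# Toy scene-graph sketch: MediaObject, Transform, and Group are
# hypothetical illustration classes, not MPEG-4 BIFS nodes.

class MediaObject:
    def __init__(self, name, position=(0, 0)):
        self.name = name
        self.position = position

class Transform:
    def __init__(self, child, translate=(0, 0)):
        self.child = child
        self.translate = translate

class Group:
    def __init__(self, *children):
        self.children = list(children)

def world_positions(node, offset=(0, 0)):
    """Flatten the graph, accumulating transforms into world coordinates."""
    if isinstance(node, MediaObject):
        return {node.name: (offset[0] + node.position[0],
                            offset[1] + node.position[1])}
    if isinstance(node, Transform):
        return world_positions(node.child,
                               (offset[0] + node.translate[0],
                                offset[1] + node.translate[1]))
    out = {}
    for child in node.children:
        out.update(world_positions(child, offset))
    return out

scene = Group(MediaObject("background"),
              Transform(Group(MediaObject("ball", (1, 1)),
                              MediaObject("table")), translate=(10, 5)))
print(world_positions(scene)["ball"])  # → (11, 6)
```

Moving the compound object only requires changing one Transform; the primitive objects underneath are untouched.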
Logical Structure of a Scene
[Diagram: a scene graph whose root Scene node groups a Person (with Shape and Voice), a Background, a Video, and Synthetic Objects (Ball and Table).]
Scene Description (Cont.)
• Starting from VRML, MPEG has developed a
binary language called BInary Format for Scenes
(BIFS).
• The standard differentiates parameters used to
improve the coding efficiency of an object (e.g.
motion vectors in video coding) from those used as
modifiers of an object (e.g. its position in the scene)
• Modifying the latter set does not require re-
decoding the primitive media objects.
MPEG-4 Mission
Develop a coded, streamable representation
for audio-visual objects and their
associated time-variant data along with a
description of how they are combined.
MPEG-4 Mission (Cont.)
• Coded Vs. Textual
• Streamable Vs. Downloaded
• Audio-Visual objects Vs. Individual Audio
or Visual Streams
Object Model
• Visual objects in the scene are described
mathematically and given a position in two
or three dimensional space. Similarly, audio
objects are placed in sound space.
• “Create once, access everywhere”: objects
are defined once, and the calculations to
update the screen and sound are done
locally.
Objectifying the Visual
• Classical video (from the camera) is one of
the visual objects defined in the standard.
• Objects with arbitrary shapes can be
encoded apart from their background and
can be described in two ways:
– Binary shape: for low-bitrate environments
– Gray-scale (alpha) shape: for higher-quality
content.
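The difference between the two shape representations can be illustrated with a per-pixel compositing sketch (plain Python floats over a 1D row of pixels; a real codec operates on coded shape data):

```python
# Compositing a shaped video object over a background.
# A binary mask gives a hard object/background decision (cheap);
# a gray-scale alpha mask gives gradual transparency at edges.

def composite(fg, bg, alpha):
    """Per-pixel blend: alpha=1 keeps the object, alpha=0 keeps background."""
    return [a * f + (1 - a) * b for f, b, a in zip(fg, bg, alpha)]

fg = [1.0, 1.0, 1.0, 1.0]           # object pixels
bg = [0.0, 0.0, 0.0, 0.0]           # background pixels
binary_mask = [1, 1, 0, 0]          # hard decision per pixel
alpha_mask = [1.0, 0.5, 0.25, 0.0]  # soft edges

print(composite(fg, bg, binary_mask))  # → [1.0, 1.0, 0.0, 0.0]
print(composite(fg, bg, alpha_mask))   # → [1.0, 0.5, 0.25, 0.0]
```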
Objectifying the Visual (Cont.)
• MPEG does not specify how shapes are to
be extracted. Current methods still have
limitations (e.g. Weatherman).
• MPEG-4 specifies only the decoding
process. Encoding is left to the market
place.
2D Animated Meshes
• A 2D mesh is a partition of a 2D planar region into
polygonal patches.
• A 2D dynamic mesh refers to a 2D mesh geometry
and motion information.
2D Animated Meshes (Cont.)
• The most entertaining feature of MPEG-4 is the
ability to map images onto computer-generated
shapes (meshes are currently 2D, with 3D in the
next version).
• A few parameters deforming the mesh can create
the impression of moving video from a still image
(e.g. a waving flag).
• Predefined faces are particularly interesting
meshes. Any feature (lips or eyes) may be
animated by special commands that make them
move in synchronization with speech.
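A minimal sketch of the waving-flag idea, assuming a toy row of mesh vertices: only vertex positions change per frame (a few parameters), while the texture pixels are never re-sent.

```python
import math

# Sketch of 2D mesh animation: the still texture is mapped onto mesh
# patches, and motion is conveyed by displacing only the vertices.
# The sine displacement below is an illustrative "waving" rule.

def animate_vertices(vertices, t, amplitude=0.2):
    """Displace each vertex vertically by a phase-shifted sine."""
    return [(x, y + amplitude * math.sin(2 * math.pi * (x + t)))
            for x, y in vertices]

row = [(0.0, 0.0), (0.25, 0.0), (0.5, 0.0), (0.75, 0.0)]
frame = animate_vertices(row, t=0.25)
print(len(frame))  # → 4
```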
System Architecture
• Streaming data for media objects.
• Different architecture layers
– Delivery layer
– Sync layer
– Compression layer
– Composition layer
• Syntax Description
Streaming data for media objects
• The data needed for media objects can be
conveyed in one or more Elementary
Streams (ESs).
• An Object Descriptor (OD) identifies all
streams associated with one media object.
• The OD contains a set of descriptors that
characterize the ESs (required decoder
resources, encoding timing, ..)
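The OD/ES relationship might be sketched like this (field names are illustrative, not the normative MPEG-4 descriptor syntax):

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the Object Descriptor structure: one OD
# groups the descriptors of all elementary streams of a media object.

@dataclass
class ESDescriptor:
    es_id: int
    stream_type: str          # e.g. "visual", "audio", "scene description"
    decoder_buffer_size: int  # required decoder resources (bytes)
    avg_bitrate: int          # bits per second

@dataclass
class ObjectDescriptor:
    od_id: int                # referenced from the scene description
    es_descriptors: list = field(default_factory=list)

od = ObjectDescriptor(od_id=1)
od.es_descriptors.append(ESDescriptor(101, "visual", 65536, 384_000))
od.es_descriptors.append(ESDescriptor(102, "audio", 8192, 64_000))
print(len(od.es_descriptors))  # → 2
```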
[Diagram: layered architecture. Elementary streams (ESs), including the scene and OD streams, leave the Compression Layer and cross the DMIF Application Interface (DAI) into the Sync Layer, where each is SL-packetized. Below the DMIF Network Interface (DNI), the Delivery Layer groups streams with FlexMux and carries them over various transport protocols. Decoded objects are assembled in the Composition Layer.]
Delivery Layer
• Contains a two-layer multiplexer
– FlexMux: a tool defined according to the DMIF
(Delivery Multimedia Integration Framework). It
allows grouping of ESs with a low overhead.
– TransMux: the second layer, which offers
transport service interfaces to different transport
protocols (UDP/IP, MPEG-2, ….)
DMIF
• A session protocol for the management of
multimedia streaming over generic delivery
technologies. It is similar in spirit to FTP.
• Actions:
– Set up a session with the remote side
– Select streams and request that they be streamed
– The peer returns pointers to the stream connections
– Establish the connections themselves.
• The user can specify QoS requirements at the
DAI. It is up to DMIF to ensure these
requirements are satisfied.
Delivery layer (Cont.)
• The functionality of DMIF is expressed by an
interface called the DMIF Application Interface (DAI)
• The DAI defines a single, uniform interface for
accessing multimedia content over a multitude of
delivery technologies.
• The DAI is the reference point at which elementary
streams can be accessed as SL-packetized
streams.
• The sync layer talks to the delivery layer through the DAI.
Sync Layer
• SL: a flexible and configurable packetization
facility that provides timing, fragmentation,
and continuity information on associated
data packets. (SL-packetized Elementary Streams)
• It does not provide framing information (no
packet length in the header); the delivery
layer handles that.
Sync Layer Functionality
• Identifying time-stamped Access Units (data units
that each comprise a complete representation unit).
• Each packet is an access unit or a fragment of an
access unit.
• These access units form the only semantic
structure of ESs in this layer.
• Stamping access units adds timing information
for decoding and composition.
• The SL retrieves ESs from SL-packetized ESs.
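A sketch of SL packetization under these rules (illustrative field names, not the normative SL packet header). Note the deliberate absence of a length field: framing is left to the delivery layer.

```python
# Split one access unit (AU) into SL packets carrying timing and
# continuity information. Timestamps go only on the first fragment.

def sl_packetize(access_unit, dts, cts, max_payload, seq_start=0):
    packets = []
    seq = seq_start
    for off in range(0, len(access_unit), max_payload):
        chunk = access_unit[off:off + max_payload]
        packets.append({
            "sequence_number": seq,               # continuity counter
            "au_start": off == 0,                 # first fragment of the AU
            "au_end": off + max_payload >= len(access_unit),
            "dts": dts if off == 0 else None,     # decoding time stamp
            "cts": cts if off == 0 else None,     # composition time stamp
            "payload": chunk,                     # no length field in header
        })
        seq += 1
    return packets

pkts = sl_packetize(b"x" * 1000, dts=0, cts=40, max_payload=400)
print(len(pkts), pkts[0]["au_start"], pkts[-1]["au_end"])  # → 3 True True
```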
Compression Layer
• The streams are sent to their respective
decoders that process the data and produce
composition units.
• In order to relate ESs to media objects,
Object Descriptors (ODs) are used to convey
information about the number and
properties of the set of ESs that belongs to a
media object.
Compression Layer (Cont.)
• Scene Description: Defines
– The spatial and temporal position of the various
objects
– The objects' dynamic behavior
– Interactivity features
• The scene description contains unique
identifiers that point to object descriptors.
• Tree structured and based on VRML
Composition Layer
• Using scene description and decoded audio-
visual object data to render the final scene
presented to user.
• MPEG-4 does not specify how information
is rendered
• Composition is performed at the receiver
System Principles
[Diagram: the scene descriptor ES references object descriptors (ODs) carried in the object descriptor ES; each OD holds the ES descriptors (ESDs) that identify the media ESs composing the scene.]
Decoding Buffer Architecture
[Diagram: streams arrive through the DMIF Application Interface into per-stream decoder buffers; each decoder drains its buffer and writes decoded units to memory, from which the compositor assembles the final presentation.]
Syntax Description
• MPEG-4 defines a syntactic description
language (MSDL) to describe the exact
binary syntax for bitstreams carrying media
objects and for bitstreams with scene
description information
• This language is an extension of C++, and is
used to describe the syntactic representation
of objects
Tools
• Stream Management: The Object
Description Framework (ODF)
• Timing and synchronization: The System
Decoder Model (SDM)
• Presentation Engine: (BIFS)
Tools - ODF
• Provides the glue between the scene description and the
elementary streams.
• Unique identifiers are used in the scene description to point
to the OD.
• The OD is a structure that encapsulates the setup and
association information for a set of ES’s.
• OD’s are transported in dedicated ES’s called Object
Descriptor Streams (ODS).
• This makes it possible to associate timing information
with a set of OD's.
• Provides mechanisms to describe hierarchical
relations between streams, reflecting scalable
encoding of the content.
Tools - ODF (Cont.)
• The initial OD, a derivative of the object
descriptor, is a key element necessary for
accessing MPEG-4 content.
• It contains at least two elementary stream
descriptors:
– One points to the scene description stream.
– Others may point to object descriptor streams.
Initial Object Descriptor
[Diagram: the initial OD points to the scene descriptor stream and the object descriptor stream; the ODs in the object descriptor stream carry the ESDs for the media ESs of the scene.]
Tools - SDM
• An adaptation of the MPEG-2 System
Target Decoder (which describes temporal and
buffer constraints for packetized ES's).
• MPEG-4 chose not to define multiplexing
constraints in the SDM.
• SDM assumes the concurrent delivery of
already demultiplexed ES’s to the decoder
buffer.
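The buffer constraint the SDM expresses can be sketched as a simple fill/drain check (a toy model; the sizes and times below are hypothetical):

```python
# Access units arrive into a decoder buffer and are removed at their
# decoding time stamps; the sender must schedule delivery so the buffer
# never overflows or underflows.

def check_buffer(events, buffer_size):
    """events: list of (time, delta_bytes); + for arrivals, - for removals."""
    level = 0
    for _, delta in sorted(events):
        level += delta
        if level > buffer_size:
            return "overflow"
        if level < 0:
            return "underflow"
    return "ok"

events = [(0, 300), (1, 300), (2, -500), (3, 300), (4, -400)]
print(check_buffer(events, buffer_size=700))  # → 'ok'
```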
Tools - BIFS
• A set of nodes to represent the primitive
scene objects, the scene graph constructs,
the behavior and activity.
• BIFS scene tells where and when to render
the media
Tools - BIFS (Cont.)
• Used to describe scene decomposition
information.
– Spatial and Temporal locations of objects.
– Object attributes and behavior.
– Relationships between elements in the scene
graph.
• Relies heavily on VRML.
VRML
A file format for describing 3D interactive
worlds (scenes) and objects. It may be used
in conjunction with the WWW, and it can
create 3D representations of complex
scenes, as in virtual reality.
VRML Example – Shape node
Shape {
  geometry IndexedFaceSet {
    coordIndex [0, 1, 3, -1, 0, 2, 5, -1, …]
    coord Coordinate { point [0.0 5.0 ..] }
    color Color { color [0.2 0.7…] }
    normal Normal { vector [0.0 1.0 0.0 ..] }
    texCoord TextureCoordinate { point [0 1.0 ,..] }
  }
  appearance Appearance { material Material { transparency 0.5 } }
}
BIFS vs. VRML
• VRML lacks important features:
– The support of natural audio and video.
– Timing model is loosely specified.
– VRML worlds (scenes) are often very large.
• BIFS is a superset of VRML.
– A binary format not a textual format (shorter)
– Real-time streaming
– Definition of 2D objects
– Facial Animation
– Enhanced Audio
BIFS Protocols
• BIFS Scene Compression (binary rather than textual)
• BIFS-Command (produces one-time updates to the
scene):
– Replace the whole scene with a new scene.
– Insert a node in a grouping node.
– Delete a node.
– Change a field value.
• BIFS-Anim (used for continuous animation of the
scene): allows modification of any value in the
scene: viewpoints, transforms, colors, lights
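The four BIFS-Command operations can be sketched on a toy dict-based scene (illustrative only, not the binary BIFS encoding):

```python
# Apply one BIFS-Command-style update to a toy scene: replace the whole
# scene, insert a node into a group, delete a node, or change a field.

def apply_command(scene, cmd):
    if cmd["op"] == "replace_scene":
        return cmd["scene"]
    if cmd["op"] == "insert_node":
        scene[cmd["group"]].append(cmd["node"])
    elif cmd["op"] == "delete_node":
        scene[cmd["group"]].remove(cmd["node"])
    elif cmd["op"] == "change_field":
        scene[cmd["node"]] = cmd["value"]
    return scene

scene = {"root": ["video"], "video.position": (0, 0)}
scene = apply_command(scene, {"op": "insert_node",
                              "group": "root", "node": "audio"})
scene = apply_command(scene, {"op": "change_field",
                              "node": "video.position", "value": (10, 20)})
print(scene["root"], scene["video.position"])  # → ['video', 'audio'] (10, 20)
```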
Version 2
• Intellectual Property Management &
Protection (IPMP)
• Advanced BIFS
• MPEG-4 File Format
• MPEG-J
• Coding of 3D Meshes
• Body Animation
Advanced BIFS
• Multi-user functionality to access the same scene.
• Advanced audio BIFS for more natural sounds,
and sound environment modeling (air absorption,
natural distance attenuation).
• Face and body animation.
• PROTO, EXTERNPROTO, and Script VRML
constructs.
• Other VRML nodes not included in version 1.
MPEG-4 File Format (MP4)
• Designed to contain the media information
of an MPEG-4 presentation in a flexible
extensible format that facilitates
interchange, management, editing and
presentation.
• The design is based on QuickTime® format.
• Composed of object-oriented structures
called atoms.
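The atom layout can be sketched with a minimal parser (each atom is a 4-byte big-endian size plus a 4-byte type, then payload; 64-bit extended sizes and nested parsing are omitted from this sketch):

```python
import struct

# Walk a flat sequence of MP4 atoms (boxes) and list (type, size) pairs.
# The size field covers the 8-byte header plus the payload.

def parse_atoms(data):
    atoms, off = [], 0
    while off + 8 <= len(data):
        size, = struct.unpack(">I", data[off:off + 4])
        if size < 8:
            break  # malformed (or 64-bit size, not handled here)
        atype = data[off + 4:off + 8].decode("ascii")
        atoms.append((atype, size))
        off += size
    return atoms

blob = struct.pack(">I", 12) + b"ftyp" + b"mp41" \
     + struct.pack(">I", 8) + b"moov"
print(parse_atoms(blob))  # → [('ftyp', 12), ('moov', 8)]
```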
MPEG-J
• Specification of Java APIs in the MPEG-4 System
(Scene Graph, Resource Manager, etc.)
• Content creators may embed complex control and
data processing mechanisms to intelligently
manage the operation of the audio-visual session.
• The Java application is delivered as a separate ES
to the terminal and then directed to the MPEG-J
run-time environment.
Coding of 3D Meshes
• Coding of generic 3D meshes to efficiently
code synthetic 3D objects.
• LOD (Level of Detail) scalability to reduce
rendering time for objects that are distant
from the viewer.
• 3D progressive geometric meshes (temporal
enhancement of 3D mesh).
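A sketch of LOD selection over a progressive mesh (the refinement records and the distance-to-detail rule below are hypothetical, for illustration only):

```python
# A progressive mesh stores a coarse base plus refinement records.
# Distant objects apply fewer refinements, cutting rendering time.

def select_lod(base_faces, refinements, distance, full_detail_range=10.0):
    """Use all refinements up close, progressively fewer farther away."""
    ratio = max(0.0, min(1.0, full_detail_range / max(distance, 1e-9)))
    n = int(len(refinements) * ratio)
    return base_faces + refinements[:n]

base = ["f0", "f1"]                   # coarse base mesh faces
refs = ["r0", "r1", "r2", "r3"]       # refinement records, coarse to fine
print(len(select_lod(base, refs, distance=20.0)))  # → 4
```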
Body Animation
• A body is an object capable of producing
virtual body models and animations in the
form of a set of 3D polygon meshes ready
for rendering.
