Showing posts with label NPU. Show all posts

Tuesday, December 6, 2016

Mellanox NPS-400 Processor Delivers 400 Gbps

Mellanox Technologies reported unprecedented packet processing performance of its NPS-400 Network Processor when using its newly released Deep Packet Inspection and Stateful Packet Processing software libraries.

Mellanox said these new software libraries, coupled with the hardware acceleration capabilities of the NPS-400, enable Deep Packet Inspection for application recognition at record-breaking rates of up to 400 Gbps, while handling 100 million flows with an average packet size of 400 bytes.
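
To put the quoted figures in perspective, the sketch below converts a 400 Gbps line rate and a 400-byte average packet into packets per second. It is back-of-the-envelope arithmetic, not a Mellanox benchmark: the function name is hypothetical, and the announcement does not say whether the 400 bytes include per-frame Ethernet overhead, so both cases are shown.

```python
# Back-of-the-envelope packet rate for a given line rate and packet size.
# Assumption: the 400-byte average is the frame size; the announcement does
# not say whether per-frame Ethernet overhead is counted separately.

ETHERNET_OVERHEAD_BYTES = 20  # 8-byte preamble/SFD + 12-byte inter-frame gap


def packets_per_second(line_rate_bps: float, packet_bytes: int,
                       overhead_bytes: int = 0) -> float:
    """Return the packets per second a link carries at full line rate."""
    bits_per_packet = (packet_bytes + overhead_bytes) * 8
    return line_rate_bps / bits_per_packet


line_rate = 400e9   # 400 Gbps, from the announcement
avg_packet = 400    # bytes, from the announcement

raw = packets_per_second(line_rate, avg_packet)
on_wire = packets_per_second(line_rate, avg_packet, ETHERNET_OVERHEAD_BYTES)

print(f"ignoring framing overhead: {raw / 1e6:.1f} Mpps")
print(f"with Ethernet overhead:    {on_wire / 1e6:.1f} Mpps")
print(f"time budget per packet:    {1e9 / raw:.1f} ns")
```

Under these assumptions the device has roughly 125 million packets to classify each second, or about 8 nanoseconds per packet.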

These processing capabilities could be used for Intrusion Detection Systems and Intrusion Prevention Systems and to accelerate processing capabilities for switch routers.

“Qosmos is very excited to collaborate with Mellanox providing a record breaking performance of Stateful Packet Processing and Deep Packet Inspection at 400Gb/s on the Mellanox NPS-400 solution,” said Thibaut Bechetoille, CEO of Qosmos. “Deep Packet Inspection drives L7 applications intelligence in the network and we expect further deployment of L7 services at more and more places in the network.”

Thursday, September 12, 2013

Cisco Unveils 400G Network Processor for SDN

Cisco unveiled its nPower X1 integrated network processor -- the first generation of a new line of custom-developed silicon designed for programmatically handling high volumes of transactions at high data rates.

Cisco said its nPower X1 is purpose-built for software-defined networking (SDN), enabling on-the-fly reprogramming for new levels of service agility and simplified network operation.  

Key features include:
  • 400 Gigabits-per-second (Gbps) throughput to enable multi-terabit network performance.  All packet processing, traffic management and input/output functions are integrated on a single nPower X1 and operate at high performance and scale.
  • Highest-performing programmable control designed to seamlessly handle hundreds of millions of unique transactions per second.  The nPower's industry-leading processing architecture is purpose built for machine-driven events and ultra-high-definition video applications.
  • 4 billion transistors on a single chip for performance, functionality, programmability, and scale.
  • Enables solutions with eight times the throughput and one quarter the power per bit compared with Cisco's previous industry-leading network processor. 
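
The last bullet can be unpacked with simple arithmetic. The sketch below assumes, hypothetically, that the "eight times" comparison applies chip-to-chip against the 400 Gbps figure and that both parts run at full rate; Cisco's announcement states neither, and the function names are illustrative only.

```python
# Naive arithmetic on the "8x throughput, 1/4 power per bit" claim.
# Assumption: the ratios apply chip-to-chip at full line rate.


def implied_previous_throughput_gbps(new_gbps: float, speedup: float) -> float:
    """Throughput of the older part implied by the quoted speedup."""
    return new_gbps / speedup


def relative_total_power(throughput_ratio: float,
                         power_per_bit_ratio: float) -> float:
    """Total power scales as (bits per second) x (energy per bit)."""
    return throughput_ratio * power_per_bit_ratio


previous = implied_previous_throughput_gbps(400, 8)
power = relative_total_power(8, 0.25)

print(f"implied previous-generation throughput: {previous:.0f} Gbps")
print(f"total power relative to previous part:  {power:.1f}x")
```

Read this way, a quarter of the power per bit at eight times the bit rate still means about twice the total power at full load.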

In a blog post, Nikhil Jayaram, VP of Engineering, writes:

"With over 4 billion transistors, this highly integrated 400 Gbps throughput single-chip will enable Terabit class solutions. It has sophisticated programmable control using open APIs and advanced compute operations that makes it ideal for software defined networks while handling extremely high event rates. It will help simplify network operations and allow new business models while it enables our customers to both support rapid bandwidth growth and transform the Internet."

Monday, August 19, 2013

Marvell's New Prestera Switching Processors Target Dynamic Access/Aggregation

Marvell introduced its Prestera DX4200 series of packet processors for the access and aggregation layers of fixed and mobile networks.

The new product family, which represents the eighth generation of Marvell switching silicon, is implemented in 28nm. The design integrates multi-core ARM v7 CPUs, a carrier-grade traffic manager and a flexible IPv6 packet processing pipeline to enable dynamic software-defined networking and advanced service virtualization.

It supports CAPWAP, MPLS, VPLS, OAM, SPB and Bridge Port Extension, and offers synchronization features. An integrated Interlaken interface also enables the development of transport and circuit-switched solutions while leveraging the service-enabling paradigms of the DX4200. The integrated traffic manager offers hierarchical flow-based quality of service and massive external buffering, supporting tens of thousands of applications and users through unique queuing schemes that ensure no variance in user experience across different access models. Sampling begins in September.

"As demand for higher service density per watt increases, Marvell is uniquely positioned to offer platforms for the software defined storage, networking, mobile and compute clouds being designed today," said Ramesh Sivakolundu, vice president for the Connectivity, Services and Infrastructure Business Unit (CSIBU) at Marvell Semiconductor.

Tuesday, June 4, 2013

Cavium Debuts OCTEON III MIPS64 Multicore Processors

Cavium introduced two new families of 28nm dual and quad core OCTEON III MIPS64 multicore processors designed for a broad range of applications, including Wired and Wireless Gateways, 802.11ac access points, Switches, Routers, UTM/Security Appliances, Network Attached Storage, etc.

The new processors include up to 4 MIPS64 cores with full hardware virtualization, Deep Packet Inspection (DPI), Packet processing, Security and QOS capabilities in a highly integrated System on a Chip.

Cavium’s new OCTEON III single, dual and quad-core SoCs are available in a small-footprint plastic LBGA package and are targeted at low-power, low-cost applications. The new designs support full hardware virtualization, which allows applications to be 'firewalled' from each other, lets legacy and newer operating systems run concurrently, and enables live in-service upgrades. The broad range of connectivity options includes GbE, 10GbE XAUI, SATA 3.0, USB 3.0 and PCIe controllers.

Cavium has also included sophisticated power management techniques, including "Power Optimizer" to reduce active power and support for Hibernate, Standby and Idle modes of operation, which help cut overall power consumption.

Commercial availability is expected in Q3.

Wednesday, October 10, 2012

Tilera Debuts 64-bit, 9-Core Processor for Network Appliances

Tilera unveiled a 64-bit 9-core processor targeted at applications in networking, multimedia, storage and general purpose computing.

The TILE-Gx9 integrates a high-performance memory controller, Ethernet and PCI Express interfaces, and available crypto and compression engines to reduce both system cost and circuit board area. The multicore performance enables the device to power 10 Gbps routers and firewalls.

The TILE-Gx9 is fabricated in a 40 nanometer technology.  Its 3 x 3 array of three-issue, 64-bit cores is connected through Tilera’s patented iMesh on-chip network, which supports an advanced virtual memory system.  Each core includes 32 kilobytes (KB) of L1 I-cache, 32 KB of L1 D-cache and 256 KB of L2 cache, with 2.3 megabytes of coherent L3 cache across the device. Maximum processor utilization is ensured with an on-board 72-bit DDR3 memory controller that supports speeds of up to 1600 MT/s and 64 GBytes total capacity.
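
The cache and memory figures can be cross-checked with a little arithmetic. This is a rough sketch, not a Tilera specification: it assumes the 72-bit DDR3 interface carries 64 data bits plus 8 ECC bits (the usual arrangement, but not stated in the announcement), and it simply observes that nine 256 KB L2 caches add up to the quoted 2.3-megabyte device-wide figure. The helper name is hypothetical.

```python
# Cross-check of the TILE-Gx9 cache and memory numbers quoted above.
# Assumption: the 72-bit DDR3 interface is 64 data bits + 8 ECC bits.

CORES = 3 * 3          # 3 x 3 array of cores
L2_PER_CORE_KB = 256   # per-core L2 cache, from the announcement

aggregate_l2_kb = CORES * L2_PER_CORE_KB


def ddr3_peak_bytes_per_second(mega_transfers_per_s: int,
                               data_bits: int) -> int:
    """Theoretical peak bandwidth of one DDR3 channel, in bytes per second."""
    return mega_transfers_per_s * 1_000_000 * data_bits // 8


peak = ddr3_peak_bytes_per_second(1600, 64)

print(f"aggregate L2 across cores: {aggregate_l2_kb} KB")
print(f"peak DDR3 bandwidth:       {peak / 1e9:.1f} GB/s")
```

The 2304 KB total matches the 2.3-megabyte coherent cache figure, and a single channel at 1600 MT/s tops out at 12.8 GB/s in theory.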

“We are seeing enormous market traction and design win activity with our 16 and 36-core TILE-Gx processors, unseating embedded processors and other difficult-to-program devices,” said Devesh Garg, president and CEO at Tilera.

Monday, October 1, 2012

Broadcom Samples 28nm XLP 200-Series Multicore Processor

Broadcom has begun sampling its 28nm XLP 200-Series network processor for enterprise, service provider 4G/LTE, data center, cloud computing and software defined networking (SDN) equipment.  The processor family, which is the world's first 28nm multicore communications processor family, promises up to 400 percent faster performance than competing solutions while lowering power consumption by up to 60 percent. 

Broadcom said the product launch demonstrates its successful integration of NetLogic Microsystems, which developed the XLP processors.

The new processors combine quad issue, quad threading and 2 GHz out-of-order execution capabilities with integrated networking and security acceleration.  The XLP 200-Series is the first to integrate a grammar processing engine, a fourth generation regular expression (RegEx) engine, and a broad range of autonomous encryption and authentication processing engines to deliver comprehensive Layer 7 deep-packet inspection (DPI) capabilities and complete offload of the compute-intensive security functions from the CPU cores. 

Some key capabilities:
  • Quad-issue, quad-threading and out-of-order execution
  • Total Security Acceleration Technology: High performance grammar processing, DPI/RegEx engine, encryption/decryption and authentication 
  • Autonomous Acceleration Engine Modules: Offloads processing tasks, freeing up the cores to perform other compute-intensive application dependent tasks
  • Hardware Acceleration for packet ordering, network management, compression/decompression and RAID5/6 storage, etc.
  • Processor Core Enhancements for improved pre-fetch performance and reduced branch mis-predict penalties
  • Processor Cache Architecture: MOESI+ coherent, three-level cache architecture and shared 16-way set associative Level 3 cache
  • Memory Subsystem: On-chip DDR3 memory controller; configurable channel width (40 or 72 bits)
  • Fast Messaging Network System: Low-latency, high-speed system allows non-intrusive internal communication and control messaging among NXCPUs, acceleration engines and input/output
  • Software Development Kit (SDK): Comprehensive SDK with reference and production-ready software components accelerates time-to-market

  • In September 2011, NetLogic first unveiled its XLP II family of processors based on 28nm process technology, packing up to 80 high-performance NXCPUs per chip and promising a 5-7x performance improvement over the existing XLP processors. NetLogic said its XLP II processor family is designed to deliver over 100 Gbps of network processing performance per device and over 800 Gbps in a clustered, fully coherent system. The NXCPUs feature an enhanced quad-issue, quad-threaded, superscalar out-of-order processor architecture capable of operating at up to 2.5 GHz to provide unmatched control and data plane processing and a low-power profile. 
    NetLogic is adding innovations that improve pre-fetch performance and reduce branch mis-predict penalties and cache access latencies. The family also significantly expands the tri-level cache architecture to over 32MB of fully coherent on-chip cache, which represents over 260MB of on-chip cache in the maximum clustered configuration of 8 fully coherent XLP II processors.

    NetLogic also introduced a second-generation high-speed Inter-chip Coherency Interface (ICI) that will enable system designs with eight sockets of XLP II processors for scalability of up to 640 NXCPUs. Full processor and memory coherency are enabled across all 640 NXCPUs, allowing software applications to run in Symmetric Multi Processing (SMP) or Asymmetric Multi Processing (AMP) modes.

  • Earlier this year, Broadcom acquired NetLogic Microsystems in a deal valued at $3.7 billion ($50 per share) net of cash assumed.  NetLogic Microsystems, which was based in Santa Clara, California, added a number of critical new product lines and technologies to Broadcom's portfolio, including knowledge-based processors, multi-core embedded processors, and digital front-end processors.
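
The XLP II cluster figures in the notes above follow from multiplying the per-chip numbers by eight sockets. The sketch below reproduces that arithmetic. The per-chip cache and throughput values are the lower bounds quoted by NetLogic ("over 32MB", "over 100 Gbps"), so the totals are lower bounds too, and the function name is hypothetical.

```python
# Scaling the quoted per-chip XLP II figures to an 8-socket cluster.
# Per-chip cache and throughput are quoted as "over" these values, so the
# totals computed here are lower bounds, not exact specifications.

NXCPUS_PER_CHIP = 80
THREADS_PER_NXCPU = 4     # "quad-threaded"
CACHE_MB_PER_CHIP = 32    # "over 32MB"
GBPS_PER_CHIP = 100       # "over 100 Gbps"


def cluster_totals(chips: int) -> dict:
    """Multiply the per-chip figures by the number of coherent sockets."""
    nxcpus = chips * NXCPUS_PER_CHIP
    return {
        "nxcpus": nxcpus,
        "hardware_threads": nxcpus * THREADS_PER_NXCPU,
        "cache_mb": chips * CACHE_MB_PER_CHIP,
        "throughput_gbps": chips * GBPS_PER_CHIP,
    }


for name, value in cluster_totals(8).items():
    print(f"{name}: {value}")
```

Eight sockets give the 640 NXCPUs and 800 Gbps cited above. The quoted "over 260MB" of cache implies slightly more than 32.5 MB per chip, which is consistent with "over 32MB".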

Monday, April 23, 2012

Broadcom Announces 100 Gbps Full Duplex Network Processor

Broadcom announced its 100 Gbps full duplex network processor unit (NPU) aimed at the next wave of 100GbE optimized switches and routers for service provider networks.

The new BCM88030 family features 64 custom processors running at 1GHz and eliminates the need for many external components. Power savings are estimated at up to 80 percent per 10GbE port.
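
As a rough illustration of what 64 processors at 1 GHz means at 100 Gbps, the sketch below estimates the clock cycles available per packet in the worst case of minimum-size Ethernet frames. This is a simplification under stated assumptions, not a Broadcom figure: it counts one direction of traffic only, treats all 64 processors as available to the 100 Gbps device, and ignores multi-threading and the hardware accelerators. Function names are hypothetical.

```python
# Worst-case cycle budget per packet at 100 Gbps with minimum-size frames.
# Assumptions: one direction of traffic, all 64 processors usable, no
# credit given for multi-threading or hardware acceleration.

MIN_FRAME_BYTES = 64      # minimum Ethernet frame
WIRE_OVERHEAD_BYTES = 20  # 8-byte preamble/SFD + 12-byte inter-frame gap


def min_frame_pps(line_rate_bps: float) -> float:
    """Packets per second when the link is full of minimum-size frames."""
    return line_rate_bps / ((MIN_FRAME_BYTES + WIRE_OVERHEAD_BYTES) * 8)


def cycles_per_packet(processors: int, clock_hz: float, pps: float) -> float:
    """Aggregate processor cycles available for each arriving packet."""
    return processors * clock_hz / pps


pps = min_frame_pps(100e9)
budget = cycles_per_packet(64, 1e9, pps)

print(f"minimum-size frame rate: {pps / 1e6:.1f} Mpps")
print(f"cycle budget per packet: {budget:.0f} cycles")
```

On these assumptions the aggregate budget is a few hundred cycles per packet, which helps explain why parsing, classification and look-ups are offloaded to dedicated hardware.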

The architecture leverages extensive multi-threading and hardware acceleration for functions such as packet parsing, classification and look-ups. The BCM88030 family is completely user programmable, enabling highly flexible forwarding implementations. The BCM88030 NPU family also includes a proprietary algorithmic look-up engine using low-cost DDR3 DRAM that enables massive scale for Layer 2, IPv4 and IPv6 tables while significantly reducing system cost. Algorithmic on-chip access control list (ACL) capability is available along with seamless expansion using Broadcom's NL566xx knowledge-based processor (KBP).

The BCM88030 family consists of three devices: the 100 Gbps BCM88038 NPU, the 50 Gbps BCM88034 NPU and the 24 Gbps BCM88032 NPU. All devices are now sampling, with production volume slated for the second half of 2012.